Re: [HACKERS] Parallel Seq Scan
On Thu, Jan 22, 2015 at 7:23 PM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 22, 2015 at 5:57 AM, Amit Kapila amit.kapil...@gmail.com wrote: 1. Scanning block-by-block has a negative impact on performance, and I think it will degrade more if we increase the parallel count, as that can lead to more randomness. 2. Scanning in fixed chunks improves the performance. Increasing the parallel count to a very large number might impact the performance, but I think we can have a lower bound below which we will not allow multiple processes to scan the relation. I'm confused. Your actual test numbers seem to show that the performance with the block-by-block approach was slightly higher with parallelism than without, whereas the performance with the chunk-by-chunk approach was lower with parallelism than without, but the text quoted above, summarizing those numbers, says the opposite. Sorry for causing confusion, I should have been more explicit about explaining the numbers. Let me try again: the values in the columns are times in milliseconds to complete the execution, so higher means it took more time. If you look at block-by-block, the time taken to complete the execution with 2 workers is more than with no workers, which means parallelism has degraded the performance. Also, I think testing with 2 workers is probably not enough. I think we should test with 8 or even 16. Sure, will do this and post the numbers. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] delta relations in AFTER triggers
On Thu, Oct 23, 2014 at 11:19 AM, Robert Haas robertmh...@gmail.com wrote: So what I'm imagining now is: 1. During parse analysis, p_tableref_hook gets control and calls addRangeTableEntryForTuplestore(), creating an RTE of type RTE_TUPLESTORE. The RTE stores an integer parameter-index. 2. Path generation doesn't need to do anything very exciting; it just generates a Path node of type T_TuplestoreScan. The RTE is still available, so the path itself doesn't need to know which tuplestore we're referencing, because that information is present in the RTE. 3. At plan generation time, we look up the RTE for the path and extract the parameter index, which is what gets stored in the TuplestoreScan node. 4. At executor initialization time, we use the parameter index in the TuplestoreScan to index into the EState's es_param_list_info and retrieve the tuplestore. I spent some time poking at this yesterday, based on commit 5060b9352b0d0301ffb002355f0572e93f8b05fe from https://github.com/kgrittn/postgres.git Here's where I got stuck: The plpgsql_parser_setup() callback sets pstate->p_ref_hook_state = (void *) expr, so if we add p_tableref_hook as an additional callback, that's what it has to work with to find the information needed to generate a RangeTblEntry. That is a PLpgSQL_expr, and it contains a pointer to the PLpgSQL_function, which is created when the function is compiled, which seems good, but the information we need is not there. Specifically, we need to know the names the user picked for the old and new tuplestores (tgoldtable and tgnewtable) and we need to know what the tuple descriptor should be, and the PLpgSQL_function hasn't got it. It does not seem impossible to fix that, but I'm not sure it's safe. do_compile() has the FunctionCallInfo, so from there it can get at the TriggerData and the Trigger. The trigger has got tgoldtable and tgnewtable, and the TriggerData has got tg_relation, so everything we need is there. 
We could add pointers to the relevant stuff to the PLpgSQL_function, and then the parser callbacks could get at it. However, I'm not sure that's really OK -- presumably, tg_relation is going to be valid only during the initial compile. If somebody came back and looked at that PLpgSQL_function again later, and tried to follow that pointer, bad things would happen. In practice maybe it would be OK because the only likely reason to come back and look at the PLpgSQL_function again is because we're recompiling, and at that point we'd have a new relation pointer to copy in there, and so it would probably be OK. But that feels mighty ugly. Another idea is to change what actually gets passed to the parser callback. Right now we just pass the PLpgSQL_expr. If we created a new structure that contained that plus the PLpgSQL_execstate, we'd be in fine shape. But this sort of gets at the root of the problem here: with variables, the parser callback doesn't return the actual *value*, it returns a Param node that will later, at execution time, be looked up to find the actual value. With relations, we're sort of doing the same thing - the tuplestore RTE doesn't need to contain the actual data, just the tuple descriptor. But the data structures from which we can get that information also contain the actual execution-time information, so passing them into parse-time callbacks seems like it might be, if nothing else, a recipe for future bugs. Any suggestions? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New CF app: changing email sender
Lovely! I've just pushed a change to the main website which now lets you change the email address there. That means you can go to https://www.postgresql.org/account/profile/ and change the email. After you have confirmed the change (an email will be sent to your new address when you change it), the new one should be valid in the cf app (if it doesn't show up, log out of the cf app and back in, and it should show up). I will look into Andrew's issue of being able to have multiple email addresses soon as well - but one feature per evening :) It works perfectly. Thanks. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
Re: [HACKERS] Parallel Seq Scan
On Thu, Jan 22, 2015 at 9:02 AM, Amit Kapila amit.kapil...@gmail.com wrote: I'm confused. Your actual test numbers seem to show that the performance with the block-by-block approach was slightly higher with parallelism than without, whereas the performance with the chunk-by-chunk approach was lower with parallelism than without, but the text quoted above, summarizing those numbers, says the opposite. Sorry for causing confusion, I should have been more explicit about explaining the numbers. Let me try again: the values in the columns are times in milliseconds to complete the execution, so higher means it took more time. If you look at block-by-block, the time taken to complete the execution with 2 workers is more than with no workers, which means parallelism has degraded the performance. *facepalm* Oh, yeah, right. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Parallel Seq Scan
On Thu, Jan 22, 2015 at 5:57 AM, Amit Kapila amit.kapil...@gmail.com wrote: 1. Scanning block-by-block has a negative impact on performance, and I think it will degrade more if we increase the parallel count, as that can lead to more randomness. 2. Scanning in fixed chunks improves the performance. Increasing the parallel count to a very large number might impact the performance, but I think we can have a lower bound below which we will not allow multiple processes to scan the relation. I'm confused. Your actual test numbers seem to show that the performance with the block-by-block approach was slightly higher with parallelism than without, whereas the performance with the chunk-by-chunk approach was lower with parallelism than without, but the text quoted above, summarizing those numbers, says the opposite. Also, I think testing with 2 workers is probably not enough. I think we should test with 8 or even 16. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PL/pgSQL, RAISE and error context
Hello, I just heard that there's going to be a fifth CF for 9.5, so I'm trying to gather all the patches I'd like to see in 9.5. On 8/23/13 10:36 AM, I wrote: My opinion at this very moment is that we should leave the DEFAULT verbosity alone and add a new one (call it COMPACT or such) with the suppressed context for non-ERRORs. I wonder if a better option would be to add a GUC to control this from the server side. plpgsql.suppress_simple_error_context or such, defaulting to false to maintain full backwards compatibility. That could be set to true for development installations and for client programs which only care about having all information available, rather than readability or aesthetics. Or is that a stupid idea? I just think hacking libpq for something like this is huge overkill. .marko
[HACKERS] [POC] FETCH limited by bytes.
Hello, as the discussion on async fetching on postgres_fdw shows, FETCH with a data-size limitation would be useful to get memory usage stability in postgres_fdw. Could such a feature and syntax be allowed to be added? == Postgres_fdw fetches tuples from remote servers using a cursor. The transfer gets faster as the number of fetches decreases. On the other hand, the buffer size for the fetched tuples varies widely according to their average length. 100 tuples per fetch is quite small for short tuples, but a larger fetch size will easily cause memory exhaustion. However, there's no way to know it in advance. One means to settle the contradiction would be a FETCH which sends results limited by size, not by the number of tuples. So I'd like to propose this. This patch is a POC for the feature. For example, FETCH 1 LIMIT 100 FROM c1; This FETCH retrieves up to 1 tuples but cuts out just after the total tuple length exceeds 1MB. (It does not literally LIMIT in that sense.) The syntax added by this patch is described as follows: FETCH [FORWARD|BACKWARD] ALL|SignedIconst LIMIT Iconst [FROM|IN] curname The data size to be compared with the LIMIT size is the summation of the result of the following expression. The appropriateness of it should be arguable. [if tupleslot has tts_tuple] HEAPTUPLESIZE + slot->tts_tuple->t_len [else] HEAPTUPLESIZE + heap_compute_data_size(slot->tts_tupleDescriptor, slot->tts_values, slot->tts_isnull); This patch makes the following changes: - Adds the parameter size to the following functions: (standard_)ExecutorRun / ExecutePlan / RunFromStore, PortalRun / PortalRunSelect / PortalRunFetch / DoPortalRunFetch - The core is in standard_ExecutorRun and RunFromStore: simply sum up the sent tuple lengths and compare against the given limit. - struct FetchStmt and EState have new members. - The modifications in gram.y affect the ecpg parser. 
I think I could fix them but with no confidence :( - Modified the correspondence parts of the changes above in auto_explain and pg_stat_statements, only in the parameter list. regards, -- Kyotaro Horiguchi NTT Open Source Software Center

From 6f1dd6998ba312c3552f137365e3a3118b7935be Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi horiguchi.kyot...@lab.ntt.co.jp
Date: Wed, 21 Jan 2015 17:18:09 +0900
Subject: [PATCH] Size limitation feature of FETCH v0
---
 contrib/auto_explain/auto_explain.c             |  8 +--
 contrib/pg_stat_statements/pg_stat_statements.c |  8 +--
 src/backend/commands/copy.c                     |  2 +-
 src/backend/commands/createas.c                 |  2 +-
 src/backend/commands/explain.c                  |  2 +-
 src/backend/commands/extension.c                |  2 +-
 src/backend/commands/matview.c                  |  2 +-
 src/backend/commands/portalcmds.c               |  3 +-
 src/backend/commands/prepare.c                  |  2 +-
 src/backend/executor/execMain.c                 | 35 +++--
 src/backend/executor/execUtils.c                |  1 +
 src/backend/executor/functions.c                |  2 +-
 src/backend/executor/spi.c                      |  4 +-
 src/backend/parser/gram.y                       | 59 +++
 src/backend/tcop/postgres.c                     |  2 +
 src/backend/tcop/pquery.c                       | 95 +
 src/include/executor/executor.h                 |  8 +--
 src/include/nodes/execnodes.h                   |  1 +
 src/include/nodes/parsenodes.h                  |  1 +
 src/include/tcop/pquery.h                       |  3 +-
 src/interfaces/ecpg/preproc/Makefile            |  2 +-
 src/interfaces/ecpg/preproc/ecpg.addons         | 63 
 22 files changed, 248 insertions(+), 59 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 2a184ed..f121a33 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -57,7 +57,7 @@ void _PG_fini(void);
 static void explain_ExecutorStart(QueryDesc *queryDesc, int eflags);
 static void explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction,
-				long count);
+				long count, long size);
 static void explain_ExecutorFinish(QueryDesc *queryDesc);
 static void explain_ExecutorEnd(QueryDesc *queryDesc);
@@ -232,15 +232,15 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
  * ExecutorRun hook: all we need do is track nesting depth
  */
 static void
-explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, long count)
+explain_ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, long count, long size)
 {
 	nesting_level++;
 	PG_TRY();
 	{
 		if (prev_ExecutorRun)
-			prev_ExecutorRun(queryDesc, direction, count);
+			prev_ExecutorRun(queryDesc, direction, count, size);
 		else
-			standard_ExecutorRun(queryDesc, direction, count);
+			standard_ExecutorRun(queryDesc,
Re: [HACKERS] PL/pgSQL, RAISE and error context
2015-01-22 12:37 GMT+01:00 Marko Tiikkaja ma...@joh.to: Hello, I just heard that there's going to be a fifth CF for 9.5, so I'm trying to gather all the patches I'd like to see in 9.5. On 8/23/13 10:36 AM, I wrote: My opinion at this very moment is that we should leave the DEFAULT verbosity alone and add a new one (call it COMPACT or such) with the suppressed context for non-ERRORs. I wonder if a better option would be to add a GUC to control this from the server side. plpgsql.suppress_simple_error_context or such, defaulting to false to maintain full backwards compatibility. That could be set to true for development installations and for client programs which only care about having all information available, rather than readability or aesthetics. Or is that a stupid idea? I just think hacking libpq for something like this is huge overkill. I don't think a plpgsql-only solution is a satisfactory idea. There are some mixed plpgsql / plperl ... applications - and it isn't possible to remove the error context from only one language. Regards Pavel .marko
Re: [HACKERS] collate test now failing
On Thu, Jan 22, 2015 at 11:53 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 22, 2015 at 11:50 AM, Kevin Grittner kgri...@ymail.com wrote: Kevin Grittner kgri...@ymail.com wrote: I think this may have just started with: b529b65d1bf8537ca7fa024760a9782d7c8b66e5 Confirmed that I get a clean check on the prior commit. Can you check whether this fixes it? Kevin says (via IM) that it does, but with a compiler warning. Fixed that and pushed. Back to watching the buildfarm returns roll in... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Parallel Seq Scan
On Mon, Jan 19, 2015 at 6:50 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jan 19, 2015 at 2:24 AM, Amit Kapila amit.kapil...@gmail.com wrote: Another thing is that I think prefetching is not supported on all platforms (Windows), and for such systems, as per the above algorithm, we need to rely on the block-by-block method. Well, I think we should try to set up a test to see if this is hurting us. First, do a sequential scan of a relation at least twice as large as RAM. Then, do a parallel sequential scan of the same relation with 2 workers. Repeat these in alternation several times. If the operating system is accomplishing meaningful readahead, and the parallel sequential scan is breaking it, then since the test is I/O-bound I would expect to see the parallel scan actually being slower than the normal way. I have taken some performance data as per the above discussion. Basically, I have used the parallel_count module which is part of the parallel-mode patch, as that seems closer to verifying the I/O pattern (it doesn't have any tuple communication overhead). The script used to test is attached (parallel_count.sh).

Performance Data

Configuration and DB Details
IBM POWER-7, 16 cores, 64 hardware threads
RAM = 64GB
Table Size - 120GB

Used the below statements to create the table:
create table tbl_perf(c1 int, c2 char(1000));
insert into tbl_perf values(generate_series(1,1000),'a');
insert into tbl_perf values(generate_series(1001,3000),'a');
insert into tbl_perf values(generate_series(3001,11000),'a');

Block-By-Block (time in ms)

  No. of workers:      0        2
  Run-1           267798   295051
  Run-2           276646   296665
  Run-3           281364   314952
  Run-4           290231   326243
  Run-5           288890   295684

Then I have modified the parallel_count module such that it can scan in fixed chunks, rather than block-by-block; the patch for the same is attached (parallel_count_fixed_chunk_v1.patch, a patch based on the parallel_count module in the parallel-mode patch [1]).

Fixed-Chunks (time in ms)

  No. of workers:      0        2
  Run-1           286346   234037
  Run-2           250051   215111
  Run-3           255915   254934
  Run-4           263754   242228
  Run-5           251399   202581

Observations
1. Scanning block-by-block has a negative impact on performance, and I think it will degrade more if we increase the parallel count, as that can lead to more randomness.
2. Scanning in fixed chunks improves the performance. Increasing the parallel count to a very large number might impact the performance, but I think we can have a lower bound below which we will not allow multiple processes to scan the relation.

Now I can go ahead and try the prefetching approach as suggested by you, but I have a feeling that overall it might not be beneficial (mainly because it is not supported on all platforms; we can say that we don't care for such platforms, but still there is no mitigation strategy for those platforms) due to the reasons mentioned up-thread. Thoughts?

[1] http://www.postgresql.org/message-id/ca+tgmozduk4k3xhbxc9vm-82khourezdvqwtfglhwsd2r2a...@mail.gmail.com

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com

parallel_count.sh Description: Bourne shell script
parallel_count_fixed_chunk_v1.patch Description: Binary data
Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract
On Fri, Jan 9, 2015 at 03:04:19PM -0500, Bruce Momjian wrote: Uh, where are we on this? I think someone needs to take Tom's proposed language and make it into a patch. And figure out which other functions in the documentation need similar updates. I have developed such a patch --- attached. I didn't see any other mentions of blocking in the docs. Patch applied. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
Hi, On 2015-01-20 16:28:19 +0100, Andres Freund wrote: I'm analyzing a problem in which a customer had a pg_basebackup (from standby) created 9.2 cluster that failed with WAL contains references to invalid pages. The failed record was an xlog redo visible, i.e. XLOG_HEAP2_VISIBLE. First I thought there might be another bug along the lines of 17fa4c321cc. Looking at the code and the WAL, that didn't seem to be the case (man, I miss pg_xlogdump). Other, slightly older, standbys didn't seem to have any problems. Logs show that an ALTER DATABASE ... SET TABLESPACE ... was running when the basebackup was started and finished *before* pg_basebackup finished. movedb() basically works in these steps:
1) lock out users of the database
2) RequestCheckpoint(IMMEDIATE|WAIT)
3) DropDatabaseBuffers()
4) copydir()
5) XLogInsert(XLOG_DBASE_CREATE)
6) RequestCheckpoint(CHECKPOINT_IMMEDIATE)
7) rmtree(src_dbpath)
8) XLogInsert(XLOG_DBASE_DROP)
9) unlock database
If a basebackup starts while 4) is in progress and continues until 7) happens, I think a pretty wide race opens: the basebackup can end up with a partial copy of the database in the old tablespace because the rmtree(old_path) was concurrently in progress. Normally such races are fixed during replay. But in this case, the replay of the XLOG_DBASE_CREATE will just try to do a rmtree(new); copydir(old, new);, fixing nothing. Besides making AD .. ST use sane WAL logging, which doesn't seem backpatchable, I don't see what could be done against this except somehow making basebackups fail if an AD .. ST is in progress. Which doesn't look entirely trivial either. I basically have two ideas to fix this. 1) Make do_pg_start_backup() acquire a SHARE lock on pg_database. That'll prevent it from starting while a movedb() is still in progress. Then additionally add a pg_backup_in_progress() function to xlog.c that checks (XLogCtl->Insert.exclusiveBackup || XLogCtl->Insert.nonExclusiveBackups != 0). Use that in createdb() and movedb() to error out if a backup is in progress. Not pretty, but sounds doable. We've discussed trying to sleep instead of erroring out in movedb() while a base backup is in progress, but that's nontrivial. I also don't think ALTER DATABASE is ever intentionally run at the time of a base backup. 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long-term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new WAL records. Doable, but probably too large/invasive to backpatch. Thanks to Alvaro and Petr for discussing the problem. I lean towards doing 1) in all branches and then doing 2) in master. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Sequence Access Method WIP
On 01/12/2015 11:33 PM, Petr Jelinek wrote: Second patch adds DDL support. I originally wanted to make it CREATE/DROP SEQUENCE ACCESS METHOD... but that would mean making ACCESS a reserved keyword, so I went for CREATE ACCESS METHOD FOR SEQUENCES, which does not need to change anything (besides adding METHOD to the unreserved keywords). The DDL support uses the DefineStmt infra with a very small change, as the sequence AMs are not schema-qualified, but I think it's acceptable and saves a considerable amount of boilerplate. Do we need DDL commands for this at all? I could go either way on that question. We recently had a discussion on that wrt. index access methods [1], and Tom opined that providing DDL for creating index access methods is not worth it. The extension can just insert the rows into pg_seqam with INSERT. Do we expect sequence access methods as extensions to be more popular than index access methods? Maybe, because the WAL-logging problem doesn't exist. But OTOH, if you're writing something like a replication system that needs global sequences as part of it, there aren't that many of those, and the installation scripts will need to deal with more complicated stuff than inserting a row into pg_seqam. [1] http://www.postgresql.org/message-id/26822.1414516...@sss.pgh.pa.us And the third patch is the gapless sequence implementation updated to work with the new DDL support, with some tests added for checking that dependencies work correctly. It also acts as an example of how to make custom AMs. I'll take a look at that to see how the API works, but we're not going to include it in the source tree, because it doesn't actually guarantee gaplessness. That makes it a pretty dangerous example. I'm sure we can come up with a better example that might even be useful. How about a Lamport's clock sequence, which advances once per second, in addition to when anyone calls nextval()? Or a remote sequence that uses an FDW to call nextval() on a foreign server? 
- Heikki
Re: [HACKERS] Sequence Access Method WIP
On 22/01/15 16:50, Heikki Linnakangas wrote: On 01/12/2015 11:33 PM, Petr Jelinek wrote: Second patch adds DDL support. I originally wanted to make it CREATE/DROP SEQUENCE ACCESS METHOD... but that would mean making ACCESS a reserved keyword, so I went for CREATE ACCESS METHOD FOR SEQUENCES, which does not need to change anything (besides adding METHOD to the unreserved keywords). The DDL support uses the DefineStmt infra with a very small change, as the sequence AMs are not schema-qualified, but I think it's acceptable and saves a considerable amount of boilerplate. Do we need DDL commands for this at all? I could go either way on that question. We recently had a discussion on that wrt. index access methods [1], and Tom opined that providing DDL for creating index access methods is not worth it. The extension can just insert the rows into pg_seqam with INSERT. Do we expect sequence access methods as extensions to be more popular than index access methods? Maybe, because the WAL-logging problem doesn't exist. But OTOH, if you're writing something like a replication system that needs global sequences as part of it, there aren't that many of those, and the installation scripts will need to deal with more complicated stuff than inserting a row into pg_seqam. I don't know about popularity, and I've seen the discussion about indexes. The motivation for DDL for me was handling dependencies correctly, that's all. If we say we don't care about that (and allow DROP EXTENSION even though the user has sequences that are using the AM), then we don't need DDL. [1] http://www.postgresql.org/message-id/26822.1414516...@sss.pgh.pa.us And the third patch is the gapless sequence implementation updated to work with the new DDL support, with some tests added for checking that dependencies work correctly. It also acts as an example of how to make custom AMs. I'll take a look at that to see how the API works, but we're not going to include it in the source tree, because it doesn't actually guarantee gaplessness. 
That makes it a pretty dangerous example. I'm sure we can come up with a better example that might even be useful. How about a Lamport's clock sequence, which advances once per second, in addition to when anyone calls nextval()? Or a remote sequence that uses an FDW to call nextval() on a foreign server? I have an updated patch ready; I just didn't submit it because I am otherwise busy this week. I hope to get to it this evening or tomorrow morning, so I'd wait until then before looking at the patch. The new version (the one that is not submitted yet) of the gapless sequence is way more ugly and probably not the best example either, but it does guarantee gaplessness (it stores the last value in its own value table). So I am not sure if it should be included either... -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] [POC] FETCH limited by bytes.
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Hello, as the discussion on async fetching on postgres_fdw shows, FETCH with a data-size limitation would be useful to get memory usage stability in postgres_fdw. Could such a feature and syntax be allowed to be added? This seems like a lot of work, and frankly an incredibly ugly API, for a benefit that is entirely hypothetical. Have you got numbers showing any actual performance win for postgres_fdw? Even if we wanted to do something like this, I strongly object to measuring size by heap_compute_data_size. That's not a number that users would normally have any direct knowledge of; nor does it have anything at all to do with the claimed use-case, where what you'd really need to measure is bytes transmitted down the wire. (The difference is not small: for instance, toasted values would likely still be toasted at the point where you're measuring.) regards, tom lane
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 10:57 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jan 21, 2015 at 2:30 PM, Peter Geoghegan p...@heroku.com wrote: Even following Robert's disabling of abbreviated keys on Windows, buildfarm animals hamerkop, brolga, currawong and bowerbird remain unhappy, with failing regression tests for collate and sometimes (but not always) aggregates. Some of these only use the C locale. I think that aggregates does not fail for brolga because it's built without locale_t support (not sure how to interpret MSVC configure output, but I think that the other animals do have such support). So maybe this code, executed within btsortsupport_worker() for the C locale on Windows, is at issue (note that abbreviation is still disabled): tss->locale = pg_newlocale_from_collation(collid); Yes, you seem to have rather unwisely changed around the order of the checks in btsortsupport_worker(). I've just rewritten it heavily to try to more clearly separate the decision about whether to do fmgr-elision, and if so which comparator to use, from the decision about whether to use abbreviation. Naturally I can't promise that this is completely right, but I hope that, if it's not, it will at least be easier to fix. There's no sanity in removing the lc_collate_is_c() check from the top of the function and then trying to remember to exclude that case individually from the multiple places further down that count on the fact that they'll never be reached in that case. This function may even have a third decision to make someday, and they can't all be intertwined. This seems to have broken more stuff. My working hypothesis is that the culprit is here: /* * There is no special handling of the C locale here, unlike with * varstr_cmp(). strxfrm() is used indifferently. */ As far as I can see, that's just hoping that when the locale is C, strxfrm() is memcpy(). If it isn't, then you will potentially get different results when the abbreviated keys stuff is used vs. when it isn't, because when it isn't, we're going to memcmp(), and when it is, we're going to memcmp() the results of strxfrm(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Wed, Jan 21, 2015 at 2:30 PM, Peter Geoghegan p...@heroku.com wrote: Even following Robert's disabling of abbreviated keys on Windows, buildfarm animals hamerkop, brolga, currawong and bowerbird remain unhappy, with failing regression tests for collate and sometimes (but not always) aggregates. Some of these only use the C locale. I think that aggregates does not fail for brolga because it's built without locale_t support (not sure how to interpret MSVC configure output, but I think that the other animals do have such support). So maybe this code, executed within btsortsupport_worker() for the C locale on Windows, is at issue (note that abbreviation is still disabled): tss->locale = pg_newlocale_from_collation(collid); Yes, you seem to have rather unwisely changed around the order of the checks in btsortsupport_worker(). I've just rewritten it heavily to try to more clearly separate the decision about whether to do fmgr-elision, and if so which comparator to use, from the decision about whether to use abbreviation. Naturally I can't promise that this is completely right, but I hope that, if it's not, it will at least be easier to fix. There's no sanity in removing the lc_collate_is_c() check from the top of the function and then trying to remember to exclude that case individually from the multiple places further down that count on the fact that they'll never be reached in that case. This function may even have a third decision to make someday, and they can't all be intertwined. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] fix typo in reinit.h
Sawada Masahiko wrote: Hi, Attached patch fixes a typo in reinit.h. Thanks, pushed. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Sequence Access Method WIP
On Thu, Jan 22, 2015 at 10:50 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/12/2015 11:33 PM, Petr Jelinek wrote: Second patch adds DDL support. I originally wanted to make it CREATE/DROP SEQUENCE ACCESS METHOD... but that would mean making ACCESS a reserved keyword so I went for CREATE ACCESS METHOD FOR SEQUENCES which does not need to change anything (besides adding METHOD to unreserved keywords). The DDL support uses the DefineStmt infra with some very small change as the sequence ams are not schema qualified, but I think it's acceptable and saves considerable amount of boilerplate. Do we need DDL commands for this at all? I could go either way on that question. We recently had a discussion on that wrt. index access methods [1], and Tom opined that providing DDL for creating index access methods is not worth it. The extension can just insert the rows into pg_seqam with INSERT. Do we expect sequence access methods as extensions to be more popular than index access methods? Maybe, because the WAL-logging problem doesn't exist. But OTOH, if you're writing something like a replication system that needs global sequences as part of it, there aren't that many of those, and the installation scripts will need to deal with more complicated stuff than inserting a row in pg_seqam. I think the main reason we don't need DDL for pg_am is because there's no real hope of anybody actually inserting anything in there whether we have a command for it or not. If we actually expect this to be used, I think it should probably have some kind of DDL, because otherwise what will pg_dump emit? Nothing is bad because then dumps won't restore, and direct inserts to pg_seqam doesn't seem great either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
Hi, On 21.1.2015 09:01, Jeff Davis wrote: On Tue, 2015-01-20 at 23:37 +0100, Tomas Vondra wrote: Tom's message where he points that out is here: http://www.postgresql.org/message-id/20707.1396372...@sss.pgh.pa.us That message also says: I think a patch that stood a chance of getting committed would need to detect whether the aggregate was being called in simple or grouped contexts, and apply different behaviors in the two cases. I take that as an objection to any patch which does not distinguish between the grouped and ungrouped aggregate cases, which includes your patch. I don't think 'simple context' in this case means 'aggregate without a group by clause'. The way I understood it back in April 2014 was that while the patch worked fine with grouped cases (i.e. in nodeAgg.c or such), the underlying functions are used elsewhere in a simple context (e.g. in an xpath() or so) - and in this case it was broken. And that was a correct point, and was fixed by the recent patches. But maybe I'm missing something? I don't agree with that objection (or perhaps I don't understand it); but given the strong words above, I need to get some kind of response before I can consider committing your patch. I generally agree that having two API 'facets' with different behavior is slightly awkward and asymmetric, but I wouldn't call that ugly. Right, your words are more precise (and polite). My apologies. U ... I wasn't suggesting calling the resulting API 'ugly' is impolite or so. It was meant just as a comment that the aesthetics of the API is quite a subjective matter. No apology needed. I actually modified both APIs initially, but I think Ali is right that not breaking the existing API (and keeping the original behavior in that case) is better. We can break it any time we want in the future, but it's impossible to unbreak it ;-) We can't break the old API, and I'm not suggesting that we do. I was hoping to find some alternative. Why can't we?
I'm not saying we should in this case, but there are cases when breaking the API is the right thing to do (e.g. when the behavior changes radically, and needs to be noticed by the users). -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
Re: [HACKERS] Proposal: knowing detail of config files via SQL
Sawada Masahiko sawada.m...@gmail.com writes: As per discussion http://www.postgresql.org/message-id/cad21aodkds8oqbr199wwrcp7fidvow6bbb+cgdwqhuf+gx_...@mail.gmail.com, I would like to propose a new view like pg_file_settings to know the details of config files via SQL. - Background In 9.4 postgresql.auto.conf was added to support the ALTER SYSTEM command, and that config file is loaded after postgresql.conf whenever postgresql.conf is loaded. That is, postgresql.auto.conf has quite high priority, so a value in postgresql.conf cannot take effect at all if the DBA set it manually. The ALTER SYSTEM RESET command can remove an unnecessary value from postgresql.auto.conf, but there is no way to know where the value came from. (The existing views can only give information about the setting in the last file where it is present.) - Solution The patch is not implemented yet; I am just proposing now. I'm imagining that we can have a new pg_file_settings view with the following columns to store each value currently assigned in a config file: - guc value name - guc value - config file path (e.g. /opt/data/postgresql.sql, /opt/data/postgresql.auto.conf, /opt/hoge.conf) This view could be convenient for the DBA to decide whether a postgresql.auto.conf entry is useful or not, and if it's not useful then the DBA could use the ALTER SYSTEM .. RESET command to remove it from postgresql.auto.conf. Another idea is to add additional columns to the existing pg_settings view, per the previous discussion. Please give me comments. I still don't understand what problem you think needs to be solved. It's already perfectly clear from the pg_settings view when a value came from postgresql.auto.conf.
For instance:

regression=# select name,setting,source,sourcefile,sourceline from pg_settings where name = 'TimeZone';
   name   |  setting   |       source       |                   sourcefile                    | sourceline
----------+------------+--------------------+-------------------------------------------------+------------
 TimeZone | US/Eastern | configuration file | /home/postgres/testversion/data/postgresql.conf |        531
(1 row)

regression=# alter system set timezone = 'Asia/Shanghai';
ALTER SYSTEM
regression=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

regression=# select name,setting,source,sourcefile,sourceline from pg_settings where name = 'TimeZone';
   name   |    setting    |       source       |                      sourcefile                      | sourceline
----------+---------------+--------------------+------------------------------------------------------+------------
 TimeZone | Asia/Shanghai | configuration file | /home/postgres/testversion/data/postgresql.auto.conf |          3
(1 row)

regression=# alter system reset timezone;
ALTER SYSTEM
regression=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

regression=# select name,setting,source,sourcefile,sourceline from pg_settings where name = 'TimeZone';
   name   |  setting   |       source       |                   sourcefile                    | sourceline
----------+------------+--------------------+-------------------------------------------------+------------
 TimeZone | US/Eastern | configuration file | /home/postgres/testversion/data/postgresql.conf |        531
(1 row)

What else is needed? regards, tom lane
Re: [HACKERS] PATCH: Reducing lock strength of trigger and foreign key DDL
On 01/20/2015 10:08 AM, Noah Misch wrote: Fair enough. It did reinforce pg_get_constraintdef() as a subroutine of pg_dump rather than an independent, rigorous interface. It perhaps made the function worse for non-pg_dump callers. In that vein, each one of these hacks has a cost. One could make a reasonable argument that ALTER TRIGGER RENAME locking is not important enough to justify spreading the hack from pg_get_constraintdef() to pg_get_triggerdef(). Lowering the CREATE TRIGGER lock level does not require any ruleutils.c change for the benefit of pg_dump, because pg_dump won't see the pg_trigger row of a too-recent trigger. I agree with this view, and am not sure myself that it is worth lowering the lock level of ALTER TRIGGER RENAME. I have attached a patch without the changes to ALTER TRIGGER and ruleutils.c and also fixes the comment issues noted by Andres. -- Andreas Karlsson diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml index a0d6867..fc86224 100644 --- a/doc/src/sgml/mvcc.sgml +++ b/doc/src/sgml/mvcc.sgml @@ -908,9 +908,9 @@ ERROR: could not serialize access due to read/write dependencies among transact /para para - This lock mode is not automatically acquired by any - productnamePostgreSQL/productname command. -/para + Acquired by commandCREATE TRIGGER/command and many forms of + commandALTER TABLE/command (see xref linkend=SQL-ALTERTABLE). +/para /listitem /varlistentry @@ -957,9 +957,9 @@ ERROR: could not serialize access due to read/write dependencies among transact commandTRUNCATE/command, commandREINDEX/command, commandCLUSTER/command, and commandVACUUM FULL/command commands. Many forms of commandALTER TABLE/ also acquire - a lock at this level (see xref linkend=SQL-ALTERTABLE). - This is also the default lock mode for commandLOCK TABLE/command - statements that do not specify a mode explicitly. + a lock at this level. This is also the default lock mode for + commandLOCK TABLE/command statements that do not specify + a mode explicitly. 
/para /listitem /varlistentry diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml index b3a4970..f5bbfcd 100644 --- a/doc/src/sgml/ref/alter_table.sgml +++ b/doc/src/sgml/ref/alter_table.sgml @@ -406,6 +406,9 @@ ALTER TABLE ALL IN TABLESPACE replaceable class=PARAMETERname/replaceable mode, and triggers configured as literalENABLE ALWAYS/literal will fire regardless of the current replication mode. /para + para + This command acquires a literalSHARE ROW EXCLUSIVE/literal lock. + /para /listitem /varlistentry diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c index 66d5083..08aa71b 100644 --- a/src/backend/commands/tablecmds.c +++ b/src/backend/commands/tablecmds.c @@ -2858,13 +2858,8 @@ AlterTableGetLockLevel(List *cmds) break; /* - * These subcommands affect write operations only. XXX - * Theoretically, these could be ShareRowExclusiveLock. + * These subcommands affect write operations only. */ - case AT_ColumnDefault: - case AT_ProcessedConstraint: /* becomes AT_AddConstraint */ - case AT_AddConstraintRecurse: /* becomes AT_AddConstraint */ - case AT_ReAddConstraint: /* becomes AT_AddConstraint */ case AT_EnableTrig: case AT_EnableAlwaysTrig: case AT_EnableReplicaTrig: @@ -2873,6 +2868,17 @@ AlterTableGetLockLevel(List *cmds) case AT_DisableTrig: case AT_DisableTrigAll: case AT_DisableTrigUser: +cmd_lockmode = ShareRowExclusiveLock; +break; + +/* + * These subcommands affect write operations only. XXX + * Theoretically, these could be ShareRowExclusiveLock. 
+ */ + case AT_ColumnDefault: + case AT_ProcessedConstraint: /* becomes AT_AddConstraint */ + case AT_AddConstraintRecurse: /* becomes AT_AddConstraint */ + case AT_ReAddConstraint: /* becomes AT_AddConstraint */ case AT_AlterConstraint: case AT_AddIndex: /* from ADD CONSTRAINT */ case AT_AddIndexConstraint: @@ -2909,11 +2915,9 @@ AlterTableGetLockLevel(List *cmds) /* * We add triggers to both tables when we add a * Foreign Key, so the lock level must be at least - * as strong as CREATE TRIGGER. XXX Might be set - * down to ShareRowExclusiveLock though trigger - * info is accessed by pg_get_triggerdef + * as strong as CREATE TRIGGER. */ - cmd_lockmode = AccessExclusiveLock; + cmd_lockmode = ShareRowExclusiveLock; break; default: @@ -6030,16 +6034,13 @@ ATAddForeignKeyConstraint(AlteredTableInfo *tab, Relation rel, ListCell *old_pfeqop_item = list_head(fkconstraint-old_conpfeqop); /* - * Grab an exclusive lock on the pk table, so that someone doesn't delete - * rows out from under
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
Andres, * Andres Freund (and...@2ndquadrant.com) wrote: 1) Make do_pg_start_backup() acquire a SHARE lock on pg_database. That'll prevent it from starting while a movedb() is still in progress. Then additionally add pg_backup_in_progress() function to xlog.c that checks (XLogCtl->Insert.exclusiveBackup || XLogCtl->Insert.nonExclusiveBackups != 0). Use that in createdb() and movedb() to error out if a backup is in progress. Not pretty, but sounds doable. We've discussed trying to sleep instead of erroring out in movedb(), while a base backup is in progress, but that's nontrivial. I also don't think ALTER DATABASE is ever intentionally run at the time of a base backup. 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new wal records. Doable, but probably too large/invasive to backpatch. Thanks to Alvaro and Petr for discussing the problem. I lean towards doing 1) in all branches and then doing 2) in master. I'm trying to figure out why you'd do '2' in master when in discussion of '1' you say I also don't think ALTER DATABASE is even intentionally run at the time of a base backup. I agree with that sentiment and am inclined to say that '1' is good enough throughout. Another thought would be to offer both- perhaps an AD .. ST .. CONCURRENTLY option which would WAL-log the copied data. Or a way to say my backup is more important than some AD .. ST .., which I could see some admins wanting. The other question is- what about AT .. ST? That is, ALTER TABLE .. SET TABLESPACE. Do we do things differently there or is there a similar issue? Thanks, Stephen
Re: [HACKERS] alter user/role CURRENT_USER
Looking at this a bit, I'm not sure completely replacing RoleId with a node is the best idea; some of the users of that production in the grammar are okay with accepting only normal strings as names, and don't need all the CURRENT_USER etc stuff. Maybe we need a new production, say RoleSpec, and we modify the few productions that need the extra flexibility? For instance we could have ALTER USER RoleSpec instead of ALTER USER RoleId. But we would keep CREATE USER RoleId, because it doesn't make any sense to accept CREATE USER CURRENT_USER. I think that would lead to a less invasive patch also. Am I making sense? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services
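The RoleId/RoleSpec split can be illustrated with a toy parser (purely hypothetical Python, nothing like the real bison grammar): RoleSpec resolves CURRENT_USER-style keywords, while RoleId rejects them outright.

```python
# Toy sketch of the proposed RoleId vs. RoleSpec distinction.
# Illustrative only; the real grammar lives in gram.y.

RESERVED_SPECS = {"CURRENT_USER", "SESSION_USER"}

def parse_role_id(tok: str) -> str:
    # CREATE USER accepts only a plain role name.
    if tok.upper() in RESERVED_SPECS:
        raise ValueError(f"{tok} is not allowed here")
    return tok

def parse_role_spec(tok: str, current_user: str) -> str:
    # ALTER USER additionally accepts CURRENT_USER etc.,
    # resolved to the session's current role.
    if tok.upper() in RESERVED_SPECS:
        return current_user  # simplification: both map to the same role
    return tok

print(parse_role_spec("CURRENT_USER", current_user="alice"))
```

The point of keeping both productions is that commands where the keyword is nonsensical (CREATE USER CURRENT_USER) reject it at parse time.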
Re: [HACKERS] Proposal: knowing detail of config files via SQL
Tom Lane-2 wrote regression=# alter system reset timezone; ALTER SYSTEM regression=# select pg_reload_conf(); How does someone know that performing the above commands will result in the TimeZone setting being changed from Asia/Shanghai to US/Eastern? David J.
Re: [HACKERS] hung backends stuck in spinlock heavy endless loop
On Fri, Jan 16, 2015 at 5:20 PM, Peter Geoghegan p...@heroku.com wrote: On Fri, Jan 16, 2015 at 10:33 AM, Merlin Moncure mmonc...@gmail.com wrote: ISTM the next step is to bisect the problem down over the weekend in order to narrow the search. If that doesn't turn up anything productive I'll look into taking other steps. That might be the quickest way to do it, provided you can isolate the bug fairly reliably. It might be a bit tricky to write a shell script that assumes a certain amount of time having passed without the bug tripping indicates that it doesn't exist, and have that work consistently. I'm slightly concerned that you'll hit other bugs that have since been fixed, given the large number of possible symptoms here. Quick update: not done yet, but I'm making consistent progress, with several false starts. (for example, I had a .conf problem with the new dynamic shared memory setting and git merrily bisected down to the introduction of the feature.). I have to triple check everything :(. The problem is generally reproducible but I get false negatives that throw off the bisection. I estimate that early next week I'll have it narrowed down significantly if not to the exact offending revision. So far, the 'nasty' damage seems to generally if not always follow a checksum failure and the checksum failures are always numerically adjacent.
For example:

[cds2 12707 2015-01-22 12:51:11.032 CST 2754]WARNING: page verification failed, calculated checksum 9465 but expected 9477 at character 20
[cds2 21202 2015-01-22 13:10:18.172 CST 3196]WARNING: page verification failed, calculated checksum 61889 but expected 61903 at character 20
[cds2 29153 2015-01-22 14:49:04.831 CST 4803]WARNING: page verification failed, calculated checksum 27311 but expected 27316

I'm not up on the intricacies of our checksum algorithm but this is making me suspicious that we are looking at an improperly flipped visibility bit via some obscure problem -- almost certainly with vacuum playing a role. This fits the profile of catastrophic damage that masquerades as numerous other problems. Or, perhaps, something is flipping what it thinks is a visibility bit but on the wrong page. I still haven't categorically ruled out pl/sh yet; that's something to keep in mind. In the 'plus' category, aside from flushing out this issue, I've had zero runtime problems so far aside from the main problem; bisection (at least on the 'bad' side) has been reliably engaged by simply counting the number of warnings/errors/etc in the log. That's really impressive. merlin
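The "numerically adjacent" pattern can be illustrated with a deliberately simplified checksum (PostgreSQL's real page checksum is FNV-1a-based, not this toy sum): with a plain additive checksum, flipping a single low-order bit in one byte moves the checksum by a correspondingly small amount, which is the kind of signature a stray flag-bit flip would leave.

```python
# Toy additive 16-bit checksum -- NOT PostgreSQL's actual page checksum
# algorithm, just an illustration of why one flipped low-order bit can
# produce two "numerically adjacent" checksum values.

def toy_checksum(page: bytes) -> int:
    return sum(page) & 0xFFFF

page = bytearray(100)     # stand-in for a data page, all zeros
page[10] = 0x40           # some byte carrying flag-like bits
ok = toy_checksum(page)

page[10] |= 0x04          # flip one low-order bit (e.g. a status bit)
bad = toy_checksum(page)

print(ok, bad, bad - ok)  # prints 64 68 4 -- checksums differ by exactly 4
```

With the real FNV-based algorithm the difference is not this direct, but the observed small deltas (12, 14, 5) are consistent with single-bit corruption rather than wholesale garbage.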
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: David G Johnston david.g.johns...@gmail.com writes: Tom Lane-2 wrote regression=# alter system reset timezone; ALTER SYSTEM regression=# select pg_reload_conf(); How does someone know that performing the above commands will result in the TimeZone setting being changed from Asia/Shanghai to US/Eastern? Is that a requirement, and if so why? Because this proposal doesn't guarantee any such knowledge AFAICS. The proposal provides for SQL access to all possible sources of variable value setting and, ideally, a means of ordering them in priority order, so that a search for TimeZone would return two records, one for postgresql.auto.conf and one for postgresql.conf - which are numbered 1 and 2 respectively - so that in looking at that result if the postgresql.auto.conf entry were to be removed the user would know what value in postgresql.conf would become active. Furthermore, if postgresql.conf has a setting AND there is a mapping in an #included file that information would be accessible via SQL as well. David J.
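The proposed view amounts to exposing the full priority-ordered list of occurrences rather than only the winning value; a minimal sketch of that idea (hypothetical structure, not the eventual pg_file_settings schema):

```python
# Sketch of the pg_file_settings idea: record every (file, name, value)
# occurrence in load order, where later files override earlier ones,
# and expose all of them highest-priority first. Names are hypothetical.

def file_settings(entries):
    # entries: list of (file, name, value) in the order files are read.
    out = {}
    for file, name, value in entries:
        out.setdefault(name, []).insert(0, (file, value))
    return out

entries = [
    ("postgresql.conf",      "TimeZone", "US/Eastern"),
    ("postgresql.auto.conf", "TimeZone", "Asia/Shanghai"),  # read last, wins
]

view = file_settings(entries)
# The first entry is the active value; the second shows what
# ALTER SYSTEM RESET would fall back to.
print(view["TimeZone"])
```

This is exactly the question pg_settings cannot answer today: it reports only the last file where the setting is present, not the shadowed entries beneath it.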
Re: [HACKERS] Proposal: knowing detail of config files via SQL
David G Johnston david.g.johns...@gmail.com writes: Tom Lane-2 wrote regression=# alter system reset timezone; ALTER SYSTEM regression=# select pg_reload_conf(); How does someone know that performing the above commands will result in the TimeZone setting being changed from Asia/Shanghai to US/Eastern? Is that a requirement, and if so why? Because this proposal doesn't guarantee any such knowledge AFAICS. regards, tom lane
Re: [HACKERS] Proposal: knowing detail of config files via SQL
David Johnston david.g.johns...@gmail.com writes: On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Is that a requirement, and if so why? Because this proposal doesn't guarantee any such knowledge AFAICS. The proposal provides for SQL access to all possible sources of variable value setting and, ideally, a means of ordering them in priority order, so that a search for TimeZone would return two records, one for postgresql.auto.conf and one for postgresql.conf - which are numbered 1 and 2 respectively - so that in looking at that result if the postgresql.auto.conf entry were to be removed the user would know what value in postgresql.conf would become active. Furthermore, if postgresql.conf has a setting AND there is a mapping in an #included file that information would be accessible via SQL as well. I know what the proposal is. What I am questioning is the use-case that justifies having us build and support all this extra mechanism. How often does anyone need to know what the next down variable value would be? And if they do need to know whether a variable is set in postgresql.conf, how often wouldn't they just resort to grep instead? (Among other points, grep would succeed at noticing commented-out entries, which this mechanism would not.) GUC has existed in more or less its current state for about 15 years, and I don't recall a lot of complaints that would be solved by this. Furthermore, given that ALTER SYSTEM was sold to us as more or less obsoleting manual editing of postgresql.conf, I rather doubt that it's changed the basis of discussion all that much. regards, tom lane
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
On 2015-01-22 16:38:49 -0500, Stephen Frost wrote: Andres, * Andres Freund (and...@2ndquadrant.com) wrote: 1) Make do_pg_start_backup() acquire a SHARE lock on pg_database. That'll prevent it from starting while a movedb() is still in progress. Then additionally add pg_backup_in_progress() function to xlog.c that checks (XLogCtl->Insert.exclusiveBackup || XLogCtl->Insert.nonExclusiveBackups != 0). Use that in createdb() and movedb() to error out if a backup is in progress. Not pretty, but sounds doable. We've discussed trying to sleep instead of erroring out in movedb(), while a base backup is in progress, but that's nontrivial. I also don't think ALTER DATABASE is ever intentionally run at the time of a base backup. 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new wal records. Doable, but probably too large/invasive to backpatch. Thanks to Alvaro and Petr for discussing the problem. I lean towards doing 1) in all branches and then doing 2) in master. I'm trying to figure out why you'd do '2' in master when in discussion of '1' you say I also don't think ALTER DATABASE is even intentionally run at the time of a base backup. I agree with that sentiment and am inclined to say that '1' is good enough throughout. Because the way it currently works is a major crock. It's more luck than anything that it actually somewhat works. We normally rely on WAL to bring us into a consistent state. But around CREATE/MOVE/DROP DATABASE we've ignored that. My point about not intentionally running it at the same time isn't that it's not happening, it's that it's not intended to happen at the same time. But most sane sites these days actually do use automated basebackups - making it much harder to be safe against this. And. Hm.
The difficulty of the current method is evidenced by the fact that so far nobody recognized that 1) as described above isn't actually safe. It fails to protect against basebackups on a standby as its XLogCtl state will obviously not be visible on the master. For exclusive basebackups (i.e. SELECT pg_start/stop_backup()) we can't rely on locking because these commands can happen in independent sessions. But I think we can in the builtin nonexclusive basebackups, as it's run in one session. So I guess we could rely on recovery conflicts not allowing the XLOG_DBASE_CREATE/DROP to replicate. It's nasty that a basebackup can suddenly stop replication from progressing though :(. Additionally it'd not protect stuff like pgespresso (an extension for nonexclusive standby basebackups). The other question is- what about AT .. ST? That is, ALTER TABLE .. SET TABLESPACE. Do we do things differently there or is there a similar issue? No issue, because it actually is implemented halfway sanely and uses WAL logging. I personally don't like at all that it uses FlushRelationBuffers() and reads the tables on the smgr level instead of using the buffer manager and a bulk state, but ... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
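Option 1) boils down to a two-sided interlock between backups and database moves; a toy model of the shape of that check (all names invented for illustration, not the actual xlog.c API):

```python
# Toy model of the proposed interlock between base backups and
# movedb()/createdb(). Field and function names are hypothetical.
import threading

class BackupState:
    def __init__(self):
        self.lock = threading.Lock()
        self.exclusive_backup = False
        self.nonexclusive_backups = 0

    def backup_in_progress(self) -> bool:
        # Mirrors the proposed check on the XLogCtl->Insert counters.
        with self.lock:
            return self.exclusive_backup or self.nonexclusive_backups > 0

state = BackupState()

def movedb():
    # movedb()/createdb() would error out while any backup runs.
    if state.backup_in_progress():
        raise RuntimeError("cannot move database during a base backup")
    return "moved"

state.nonexclusive_backups = 1
try:
    movedb()
except RuntimeError as e:
    print(e)
```

Andres's point is that this state lives only on the node taking the backup, which is why the check cannot protect a master from a concurrent basebackup running on a standby.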
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Thu, Jan 22, 2015 at 3:19 PM, Tom Lane t...@sss.pgh.pa.us wrote: David Johnston david.g.johns...@gmail.com writes: On Thu, Jan 22, 2015 at 3:04 PM, Tom Lane t...@sss.pgh.pa.us wrote: Is that a requirement, and if so why? Because this proposal doesn't guarantee any such knowledge AFAICS. The proposal provides for SQL access to all possible sources of variable value setting and, ideally, a means of ordering them in priority order, so that a search for TimeZone would return two records, one for postgresql.auto.conf and one for postgresql.conf - which are numbered 1 and 2 respectively - so that in looking at that result if the postgresql.auto.conf entry were to be removed the user would know what value in postgresql.conf would become active. Furthermore, if postgresql.conf has a setting AND there is a mapping in an #included file that information would be accessible via SQL as well. I know what the proposal is. What I am questioning is the use-case that justifies having us build and support all this extra mechanism. How often does anyone need to know what the next down variable value would be? And if they do need to know whether a variable is set in postgresql.conf, how often wouldn't they just resort to grep instead? (Among other points, grep would succeed at noticing commented-out entries, which this mechanism would not.) GUC has existed in more or less its current state for about 15 years, and I don't recall a lot of complaints that would be solved by this. Furthermore, given that ALTER SYSTEM was sold to us as more or less obsoleting manual editing of postgresql.conf, I rather doubt that it's changed the basis of discussion all that much. I doubt we'd actually see many complaints since, like you say, people are likely to just use a different technique. The only thing PostgreSQL itself needs to provide is a master inventory of seen/known settings; the user interface can be left up to other layers (psql, pgadmin, extensions, etc...).
It falls into making the system more user friendly. But maybe the answer for those users is that if your setup is so complex this would be helpful you probably need to rethink your setup. Then again, if I only interact with the server via SQL I can at least issue RESET and know the end result - after a config reload - without having to log into the server and grep the config file - or know what the system defaults are for settings that do not appear explicitly in postgresql.conf; all without having to refer to documentation as well. I'm doubtful it is going to interest any of the core hackers to put this in place but it at least warrants a couple of passes of brain-storming which can then be memorialized on the Wiki-ToDo. David J.
Re: [HACKERS] New CF app: changing email sender
On Tue, Jan 20, 2015 at 10:58 AM, Andrew Gierth and...@tao11.riddles.org.uk wrote: Kyotaro == Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes: Kyotaro Hmm. The mail address indeed *was* mine but is now obsolete, Kyotaro so that the email might bounce. But I haven't find how to Kyotaro change it within the app itself, and the PostgreSQL community Kyotaro account page. Just being able to change the email address on the community account isn't enough; I for one am subscribed to the lists with a different email address than the one associated with my community account (for spam management reasons). There needs to be a way to have multiple addresses or to specify which is to be used for the post. This should also be fixed now. You can click Edit Profile, and add one or more extra email addresses there (after email verification of course), and then pick one of those to use for sending email. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] delta relations in AFTER triggers
Robert Haas robertmh...@gmail.com wrote: Another idea is to change what actually gets passed to the parser callback. Right now we just pass the PLpgSQL_expr. If we created a new structure that contained that plus the PLpgSQL_execstate, we'd be in fine shape. But this sort of gets at the root of the problem here: with variables, the parser callback doesn't return the actual *value*, it returns a Param node that will later, at execution time, be looked up to find the actual value. With relations, we're sort of doing the same thing - the tuplestore RTE doesn't need to contain the actual data, just the tuple descriptor. But the data structures from which we can get that information also contain the actual execution-time information, so passing them into parse-time callbacks seems like it might be, if nothing else, a recipe for future bugs. That was, of course, why this patch evolved to using this structure during parsing:

typedef struct TsrmdData
{
    char       *name;       /* name used to identify the tuplestore */
    TupleDesc   tupdesc;    /* description of result rows */
} TsrmdData;

typedef TsrmdData *Tsrmd;

... and this during execution:

typedef struct TsrData
{
    TsrmdData   md;
    Tuplestorestate *tstate;    /* data (or tids) */
} TsrData;

typedef TsrData *Tsr;

The big problem, as I see it, is how to deliver these to where they are needed. I didn't think it was that hard to do with the parser hook; it's what to do to get the execution time structure to where it's needed that I can't figure out. Passing it with the parameters is tricky because we're often passing a NULL for the reference to the parameter list when we need these. Trying to coax the executor to pass in a parameter list when there are no parameters, just these ephemeral relations, seems very tricky and all solutions I have tried (other than the one Heikki and others have objected to) very fragile.
In short, the only solution which I've been able to come up with that works (and seems to me solid enough to commit) is the one that Heikki, Tom, and Robert seem to think should be made more like parameter handling; and every attempt at handling these relations like parameters seems to me too fragile for me to feel it is worthy of commit. We're really down to the wire on getting this feature into 9.5; and we're way past what was initially my goal, which was to build on this to get some incremental maintenance of (some types of simple) materialized views into 9.5. IMO we need to be able to build up tuplestores and easily reference them from within complex queries to create any sane and efficient implementation of incremental maintenance of materialized views. This patch was mainly intended to make progress on the MV front, with an incidental benefit of providing a standard feature that I have seen requested a few times. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
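The parameter-style indirection under discussion (the RTE records only an integer index at parse time; the executor resolves it to a tuplestore later) can be modeled in miniature (all names here are hypothetical stand-ins, not the real parser or executor structures):

```python
# Toy model of the param-index indirection: parse analysis records only
# metadata plus an integer index; executor startup resolves the index
# to the actual rows. All names are hypothetical.

def parse_analysis(name, tupdesc, param_index):
    # Analogue of the RTE: metadata only, no execution-time data.
    return {"kind": "RTE_TUPLESTORE", "name": name,
            "tupdesc": tupdesc, "paramno": param_index}

def executor_init(rte, es_param_list):
    # Analogue of indexing into the EState's parameter list at
    # executor initialization time.
    return es_param_list[rte["paramno"]]

rte = parse_analysis("new_table", ("id", "val"), param_index=0)
es_params = [[(1, "a"), (2, "b")]]   # tuplestore stand-in
rows = executor_init(rte, es_params)
print(rows)
```

Kevin's difficulty is precisely the second step: getting a populated parameter list to the executor when the query otherwise has no parameters at all.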
[HACKERS] fix typo in reinit.h
Hi, Attached patch fixes a typo in reinit.h. Regards, --- Sawada Masahiko fix_typo_reinit_h.patch Description: Binary data
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 12:34 PM, Robert Haas robertmh...@gmail.com wrote: This isn't really Windows-specific. The root of the problem is that when LC_COLLATE=C you were trying to use strxfrm() for the abbreviated key even though memcmp() is the authoritative comparator in that case. Exactly which platforms happened to blow up as a result of that is kind of beside the point.

I don't see how that could be a problem. Even if the strxfrm() blob wasn't identical to the original string (I think it should always be, except maybe on MacOSX), it's still reasonable to assume that there is equivalent behavior across the C locale and locales with collations like en_US.UTF-8. It wasn't as if I was using strxfrm() within bttextfastcmp_c() -- just within bttext_abbrev_convert(), to form an abbreviated key for text that uses the C locale. The C locale is just another locale - AFAICT, the only reason we have treated it differently in PostgreSQL is because that tended to avoid needing to copy and NUL-terminate, which strcoll() requires. It might be that that doesn't work out for some reason (but not because strxfrm() will not have behavior equivalent to memcpy() with the C locale -- I was *not* relying on that).

That having been said, it's clearer to continue to handle each case (C locale vs other locales) separately within the new bttext_abbrev_convert() function, just to be consistent, but also to avoid NUL-terminating the text strings to pass to strxfrm(), which as you point out is an avoidable cost. So, I'm not asking you to restore the terser uniform use of strxfrm() we had before, but, for what it's worth, that was not the real issue. The real issue was that strxfrm() spuriously used the wrong locale, as you said. Provided strxfrm() had the correct locale (the C locale), then AFAICT it ought to have been fine, regardless of whether or not it then behaved exactly equivalently to memcpy().
-- Peter Geoghegan
Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates
A new version of the patch is attached, which changes the following:

- Changed from using __int128_t to __int128.
- Actually compiles and runs code in configure to see that int128 works.
- No longer tests for __uint128_t.
- Updated pg_config.h.win32
- Renamed some things from int12 to int128, there are still some places with int16 which I am not sure what to do with.

A remaining issue is what estimate we should pick for the size of the aggregate state. This patch uses the size of the int128 version for the estimate, but I have no strong opinions on which we should use.

-- Andreas Karlsson

diff --git a/configure b/configure
index 8490eb7..d9b0e8f 100755
--- a/configure
+++ b/configure
@@ -13743,6 +13743,49 @@ _ACEOF
 fi
 
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking __int128" >&5
+$as_echo_n "checking __int128... " >&6; }
+if test "$cross_compiling" = yes; then :
+  have___int128=no
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+    return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}
+
+_ACEOF
+if ac_fn_c_try_run "$LINENO"; then :
+  have___int128=yes
+else
+  have___int128=no
+fi
+rm -f core *.core core.conftest.* gmon.out bb.out conftest$ac_exeext \
+  conftest.$ac_objext conftest.beam conftest.$ac_ext
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $have___int128" >&5
+$as_echo "$have___int128" >&6; }
+if test x$have___int128 = xyes ; then
+
+$as_echo "#define HAVE___INT128 1" >>confdefs.h
+
+fi
+
 # We also check for sig_atomic_t, which *should* be defined per ANSI
 # C, but is missing on some old platforms.
 ac_fn_c_check_type "$LINENO" "sig_atomic_t" "ac_cv_type_sig_atomic_t" "#include <signal.h>
diff --git a/configure.in b/configure.in
index b4bd09e..189e2f0 100644
--- a/configure.in
+++ b/configure.in
@@ -1760,6 +1760,30 @@ AC_DEFINE_UNQUOTED(MAXIMUM_ALIGNOF, $MAX_ALIGNOF, [Define as the maximum alignme
 AC_CHECK_TYPES([int8, uint8, int64, uint64], [], [], [#include <stdio.h>])
 
+AC_MSG_CHECKING([__int128])
+AC_TRY_RUN([
+__int128 a = 2001;
+__int128 b = 4005;
+
+int does_int128_work()
+{
+  __int128 c,d;
+
+  c = a * b;
+  d = (c + b) / b;
+  if (d != a+1)
+    return 0;
+  return 1;
+}
+main() {
+  exit(! does_int128_work());
+}
+ ], [have___int128=yes], [have___int128=no], [have___int128=no])
+AC_MSG_RESULT([$have___int128])
+if test x$have___int128 = xyes ; then
+  AC_DEFINE(HAVE___INT128, 1, [Define to 1 if the system has the type `__int128'.])
+fi
+
 # We also check for sig_atomic_t, which *should* be defined per ANSI
 # C, but is missing on some old platforms.
 AC_CHECK_TYPES(sig_atomic_t, [], [], [#include <signal.h>])
diff --git a/src/backend/utils/adt/numeric.c b/src/backend/utils/adt/numeric.c
index a8609e8..8344343 100644
--- a/src/backend/utils/adt/numeric.c
+++ b/src/backend/utils/adt/numeric.c
@@ -402,6 +402,9 @@ static void apply_typmod(NumericVar *var, int32 typmod);
 static int32 numericvar_to_int4(NumericVar *var);
 static bool numericvar_to_int8(NumericVar *var, int64 *result);
 static void int8_to_numericvar(int64 val, NumericVar *var);
+#ifdef HAVE_INT128
+static void int16_to_numericvar(int128 val, NumericVar *var);
+#endif
 static double numeric_to_double_no_overflow(Numeric num);
 static double numericvar_to_double_no_overflow(NumericVar *var);
 
@@ -2659,6 +2662,9 @@ numeric_float4(PG_FUNCTION_ARGS)
  * Actually, it's a pointer to a NumericAggState allocated in the aggregate
  * context. The digit buffers for the NumericVars will be there too.
  *
+ * On platforms which support 128-bit integers some aggregates instead use a
+ * 128-bit integer based transition datatype to speed up calculations.
+ *
  * --
  */
 
@@ -2917,6 +2923,65 @@ numeric_accum_inv(PG_FUNCTION_ARGS)
 	PG_RETURN_POINTER(state);
 }
 
+#ifdef HAVE_INT128
+typedef struct Int128AggState
+{
+	bool		calcSumX2;	/* if true, calculate sumX2 */
+	int64		N;			/* count of processed numbers */
+	int128		sumX;		/* sum of processed numbers */
+	int128		sumX2;		/* sum of squares of processed numbers */
+} Int128AggState;
+
+/*
+ * Prepare state data for a 128-bit aggregate function that needs to compute
+ * sum, count and optionally sum of squares of the input.
+ */
+static Int128AggState *
+makeInt128AggState(FunctionCallInfo fcinfo, bool calcSumX2)
+{
+	Int128AggState *state;
+	MemoryContext agg_context;
+	MemoryContext old_context;
+
+	if (!AggCheckCallContext(fcinfo, &agg_context))
+		elog(ERROR, "aggregate function called in non-aggregate context");
+
+	old_context = MemoryContextSwitchTo(agg_context);
+
+	state = (Int128AggState *) palloc0(sizeof(Int128AggState));
+	state->calcSumX2 = calcSumX2;
+
+	MemoryContextSwitchTo(old_context);
+
+	return state;
+}
+
+/*
+ * Accumulate a new input value for 128-bit aggregate functions.
+ */
+static void
Re: [HACKERS] Back-branch update releases scheduled
On Thu, Jan 22, 2015 at 03:28:39PM -0800, Peter Geoghegan wrote: On Thu, Jan 22, 2015 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're really rather overdue for updates of the pre-9.4 back branches, and 9.4 itself has probably baked for long enough to justify a 9.4.1. Accordingly, the core committee has agreed to wrap releases the week after next; that's wrap Mon Feb 2 for public announcement Thu Feb 5. I hope that gives us enough time to fix the issue in question in the 9.4.0 problem report thread "hung backends stuck in spinlock heavy endless loop".

The JSON/JSONB escape thing would be nice to fix too, before the existing behavior becomes too baked into applications.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +
Re: [HACKERS] pg_upgrade and rsync
* Bruce Momjian (br...@momjian.us) wrote: On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync.

I'm pretty sure this is all a lot easier than you believe it to be. If you want to recreate what pg_upgrade does to a cluster then the simplest thing to do is rsync before removing any of the hard links. rsync will simply recreate the same hard link tree that pg_upgrade created when it ran, and update files which were actually changed (the catalog tables).

The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though -- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source.

Thanks, Stephen
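The reason --size-only is "optimistic" can be demonstrated locally with throwaway temp directories (paths here are invented): a file whose contents changed but whose size did not is silently skipped. The --inplace flag is shown only because it is part of the flag combination discussed above; it matters when sizes do differ.

```shell
# Demo: --size-only skips a same-size file whose contents changed.
src=$(mktemp -d)
dst=$(mktemp -d)

printf 'aaaa' > "$src/f"
rsync -a "$src/" "$dst/"                      # initial copy

printf 'bbbb' > "$src/f"                      # same size, new content
rsync -a --size-only --inplace "$src/" "$dst/"

cat "$dst/f"                                  # still the stale 'aaaa'
```

This is exactly the failure mode that makes checksums (or a guarantee that nothing was modified) necessary.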
Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates
On 23/01/15 00:40, Andreas Karlsson wrote: - Renamed some things from int12 to int128, there are still some places with int16 which I am not sure what to do with. I'd vote for renaming them to int128 too; there are enough C functions that use int16 for a 16-bit integer that this is going to be confusing otherwise. -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 8:54 PM, Stephen Frost wrote: The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source. I would definitely not feel comfortable using --size-only. In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 That means the rsync hot, then rsync cold method of updating a standby is not *guaranteed* to work unless checksums are used. This may seem like an edge case, but for a small, active database it looks like it could be a real issue. -- - David Steele da...@pgmasters.net -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Parallel Seq Scan
On 01/22/2015 05:53 AM, Robert Haas wrote: Also, I think testing with 2 workers is probably not enough. I think we should test with 8 or even 16. FWIW, based on my experience there will also be demand to use parallel query using 4 workers, particularly on AWS. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
* David Steele (da...@pgmasters.net) wrote: On 1/22/15 8:54 PM, Stephen Frost wrote: The problem, as mentioned elsewhere, is that you have to checksum all the files because the timestamps will differ. You can actually get around that with rsync if you really want though- tell it to only look at file sizes instead of size+time by passing in --size-only. I have to admit that for *my* taste, at least, that's getting pretty darn optimistic. It *should* work, but I'd definitely recommend testing it about a billion times in various ways before trusting it or recommending it to anyone else. I expect you'd need --inplace also, for cases where the sizes are different and rsync wants to modify the file on the destination to match the one on the source. I would definitely not feel comfortable using --size-only. Yeah, it also occurs to me that if any of the catalog tables end up being the same size between the master and the replica that they wouldn't get copied and that'd make for one very interesting result, and not a good one. In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 Right, though that isn't really an issue in this specific case- we're talking about post-pg_upgrade but before the upgraded cluster has actually been started, so nothing should be modifying these files. That means the rsync hot, then rsync cold method of updating a standby is not *guaranteed* to work unless checksums are used. This may seem like an edge case, but for a small, active database it looks like it could be a real issue. That's certainly a good point though. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 10:05 PM, Stephen Frost wrote: In addition, there is a possible race condition in rsync where a file that is modified in the same second after rsync starts to copy will not be picked up in a subsequent rsync unless --checksum is used. This is fairly easy to prove and is shown here: https://github.com/pgmasters/backrest/blob/dev/test/lib/BackRestTest/BackupTest.pm#L1667 Right, though that isn't really an issue in this specific case- we're talking about post-pg_upgrade but before the upgraded cluster has actually been started, so nothing should be modifying these files. Indeed. This was really directed more at what Bruce said: I am thinking the fix for standys would be similar to what we recommand for upgrades with link mode using a rsync-created copy, e.g. use rsync while the master is running to create a copy of the standby, then shut down the master and run rsync again. However, at that point, you might as well just take a base backup and be done with it. -- - David Steele da...@pgmasters.net signature.asc Description: OpenPGP digital signature
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On Fri, Jan 23, 2015 at 3:49 AM, Tom Lane t...@sss.pgh.pa.us wrote: I know what the proposal is. What I am questioning is the use-case that justifies having us build and support all this extra mechanism. How often does anyone need to know what the next down variable value would be?

I would say not that often, as I think nobody changes the settings in configuration files very frequently, but it could still be meaningful in situations like those described upthread by Sawada. Similar to this, pg_reload_conf() and many such things are used infrequently; it seems to me the important point to consider here is whether such a view could serve any purpose for real users. Having multiple values for the same config parameter was possible even before ALTER SYSTEM, and there didn't seem to be much need/demand for such a view then; the reason could be that the user had no way of changing a setting at the file level with a command. But now that we provide a way, it could be useful in certain cases where the user wants to retain the previous values (the value in postgresql.conf). I understand that the usage of such a view is not that high, but it could be meaningful in some cases.

And if they do need to know whether a variable is set in postgresql.conf, how often wouldn't they just resort to grep instead?

Does the user/DBA (having superuser privileges) always have access to the postgresql.conf file? It seems to me that even if they have access, getting the information via SQL would be more appealing.

(Among other points, grep would succeed at noticing commented-out entries, which this mechanism would not.) GUC has existed in more or less its current state for about 15 years, and I don't recall a lot of complaints that would be solved by this. Furthermore, given that ALTER SYSTEM was sold to us as more or less obsoleting manual editing of postgresql.conf, I rather doubt that it's changed the basis of discussion all that much.
By providing such a view I don't mean to recommend that the user change settings by editing postgresql.conf manually and then use ALTER SYSTEM for other cases; rather, it could be used in some cases, such as for default values, or in the rare cases where the user has changed the parameters manually. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
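For comparison, what is already visible today without the proposed view: pg_settings exposes only the active ("winning") value and its origin, which is exactly the limitation under discussion. (Query sketch only; the sourcefile/sourceline columns are visible to superusers.)

```sql
-- Today's closest equivalent: shows the active value and where it came
-- from, but nothing about the overridden values further down the chain
-- (e.g. a postgresql.conf value masked by postgresql.auto.conf).
SELECT name, setting, source, sourcefile, sourceline
FROM pg_settings
WHERE name = 'work_mem';
```

The proposed view would differ by returning one row per *source* of the value, in priority order, rather than just the winner.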
[HACKERS] Back-branch update releases scheduled
We're really rather overdue for updates of the pre-9.4 back branches, and 9.4 itself has probably baked for long enough to justify a 9.4.1. Accordingly, the core committee has agreed to wrap releases the week after next; that's wrap Mon Feb 2 for public announcement Thu Feb 5. regards, tom lane
Re: [HACKERS] Back-branch update releases scheduled
On Thu, Jan 22, 2015 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're really rather overdue for updates of the pre-9.4 back branches, and 9.4 itself has probably baked for long enough to justify a 9.4.1. Accordingly, the core committee has agreed to wrap releases the week after next; that's wrap Mon Feb 2 for public announcement Thu Feb 5. I hope that gives us enough time to fix the issue in question in the 9.4.0 problem report thread "hung backends stuck in spinlock heavy endless loop". -- Peter Geoghegan
Re: [HACKERS] Using 128-bit integers for sum, avg and statistics aggregates
On 01/11/2015 02:36 AM, Andres Freund wrote: a) Afaics only __int128/unsigned __int128 is defined. See https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html b) I'm doubtful that AC_CHECK_TYPES is a sufficiently good test on all platforms. IIRC gcc will generate calls to functions to do the actual arithmetic, and I don't think they're guaranteed to be available on all platforms. That is how it behaves, e.g., for atomic operations. So my guess is that you'll need to check that a program that does actual arithmetic still links. c) Personally I don't see the point of testing __uint128_t. That's just a pointless test that makes configure run for longer.

The next version will fix all these issues.

Hm. It might be nicer to move the if (!state) elog() outside the ifdef, and add curly parens inside the ifdef.

Since that change only really works for the *_inv functions I decided to leave them all as-is for consistency.

--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -678,6 +678,12 @@
 /* Define to 1 if your compiler understands __VA_ARGS__ in macros. */
 #undef HAVE__VA_ARGS
 
+/* Define to 1 if the system has the type `__int128_t'. */
+#undef HAVE___INT128_T
+
+/* Define to 1 if the system has the type `__uint128_t'. */
+#undef HAVE___UINT128_T

pg_config.h.win32 should be updated as well.

Will be fixed in the next version.

-- Andreas Karlsson
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 22, 2015 at 10:48:37PM +0200, Heikki Linnakangas wrote:

	 * If we need to protect hint bit updates from torn writes, WAL-log a
	 * full page image of the page. This full page image is only necessary
	 * if the hint bit update is the first change to the page since the
	 * last checkpoint.
	 *
	 * We don't check full_page_writes here because that logic is included
	 * when we call XLogInsert() since the value changes dynamically.
	 */
	if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT))
	{
		/*
		 * If we're in recovery we cannot dirty a page because of a hint.
		 * We can set the hint, just not dirty the page as a result so the
		 * hint is lost when we evict the page or shutdown.
		 *
		 * See src/backend/storage/page/README for longer discussion.
		 */
		if (RecoveryInProgress())
			return;

What if XLogHintBitIsNeeded is false? That would be the case if we're not WAL-logging hints *on the standby*. Then the page will be updated without writing a WAL record. Just like in the master, if wal_log_hints is off. wal_log_hints works the same on the master or the standby. [ see below for why this entire idea might not work ]

OK, I was confused by your previous "no". It means we do update hint bits on read-only slave queries --- we just don't WAL-log them. In fact, we can't update hint bits on the standby if we are WAL-logging them, is that right? My text was saying:

	these differences can be reduced by using a fresh standby and by
	enabling <xref linkend="guc-wal-log-hints">. (While
	<varname>wal_log_hints</> transfers hint bits from the primary to
	standbys, additional hint bits are still set on the standbys by
	read-only queries.)

meaning if you don't run any read-only queries on the standby, the files will be the same on master/standby because the hint bits will be the same, and rsync will not copy the files. This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum?
I would really like to get a way to pg_upgrade the standbys, but we have never really been able to get a solution. Ideally we would update just the system table files, and if the order of pg_upgrade file renames were exactly the same, everything else would match, but I can't imagine what such an API would look like. Have pg_upgrade spit out a list of files to be copied? In fact, these are the relfilenodes pg_upgrade preserves:

 * While pg_class.oid and pg_class.relfilenode are initially the same
 * in a cluster, they can diverge due to CLUSTER, REINDEX, or VACUUM
 * FULL. In the new cluster, pg_class.oid and pg_class.relfilenode will
 * be the same and will match the old pg_class.oid value. Because of
 * this, old/new pg_class.relfilenode values will not match if CLUSTER,
 * REINDEX, or VACUUM FULL have been performed in the old cluster.
 *
 * We control all assignments of pg_type.oid because these oids are stored
 * in user composite type values.
 *
 * We control all assignments of pg_enum.oid because these oids are stored
 * in user tables as enum values.
 *
 * We control all assignments of pg_authid.oid because these oids are stored
 * in pg_largeobject_metadata.

so if the table/index relfilenodes no longer match the oid on the old cluster, due to CLUSTER, REINDEX, or VACUUM FULL, the file name will not match on the new cluster and rsync will copy the entire file. In fact, rsync is going to copy it to the wrong file name, and delete the right file. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +
Re: [HACKERS] Back-branch update releases scheduled
Bruce Momjian br...@momjian.us writes: On Thu, Jan 22, 2015 at 03:28:39PM -0800, Peter Geoghegan wrote: On Thu, Jan 22, 2015 at 3:23 PM, Tom Lane t...@sss.pgh.pa.us wrote: We're really rather overdue for updates of the pre-9.4 back branches, and 9.4 itself has probably baked for long enough to justify a 9.4.1. Accordingly, the core committee has agreed to wrap releases the week after next; that's wrap Mon Feb 2 for public announcement Thu Feb 5. I hope that gives us enough time to fix the issue in question in the 9.4.0 problem report thread "hung backends stuck in spinlock heavy endless loop". The JSON/JSONB escape thing would be nice to fix too, before the existing behavior becomes too baked into applications. So let's get on with it. We're not holding up releases for patches that may or may not materialize. regards, tom lane
Re: [HACKERS] Back-branch update releases scheduled
On Thu, Jan 22, 2015 at 3:51 PM, Tom Lane t...@sss.pgh.pa.us wrote: So let's get on with it. We're not holding up releases for patches that may or may not materialize. I don't disagree. Just pointing out that it's a consideration. -- Peter Geoghegan
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 5:43 PM, Bruce Momjian wrote: This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I don't think so. AIUI if the timestamps are different the very next thing it does is run the checksum (which is expensive). So --checksum is just going to hurt. I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. We've previously talked about required steps before an upgrade; perhaps we need a way to force an OID/relfilenode change on the old server prior to upgrade. Or, thinking outside the box here... could this type of stuff be done in postgres itself so we could generate WAL that's shipped to standbys? That would allow doing this as part of the formal upgrade process without the need for preliminary steps. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_upgrade and rsync
On 2015-01-22 14:20:51 -0500, Bruce Momjian wrote: It is possible to use pg_upgrade on streaming standby servers by making them master servers, running pg_upgrade on them, then shutting down all servers and using rsync to make the standby servers match the real master. Isn't that a pretty crazy procedure? If you need to shut down all servers anyway, you can just rsync after having run pg_upgrade on the master, no? Rsync won't really transfer less just because you ran a similar thing on the standby. Even if this would allow avoiding some rsync traffic: There's absolutely no guarantee that the standby's pg_upgrade results in an all-that-similar data directory. Far from everything in postgres is deterministic - it's easy to hit timing differences that result in noticeable differences. Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On 01/22/2015 02:09 PM, David Johnston wrote: The proposal provides for SQL access to all possible sources of variable value setting and, ideally, a means of ordering them in priority order, so that a search for TimeZone would return two records, one for postgresql.auto.conf and one for postgresql.conf - which are numbered 1 and 2 respectively - so that in looking at that result if the postgresql.auto.conf entry were to be removed the user would know that what the value is in postgresql.conf that would become active. Furthermore, if postgresql.conf has a setting AND there is a mapping in an #included file that information would be accessible via SQL as well. Wow. Um, I can't imagine any use for that which would justify the overhead. And I'm practically the settings geek. Note that a single file can have multiple copies of the same GUC, plus there's GUCs set interactively, as well as in the user and database properties. So you're looking at a lot of different versions. I think you're in a position of needing to interest someone else in this issue enough to produce a patch to argue about. I'm not seeing a lot of interest in it here. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On Thu, Jan 22, 2015 at 06:04:24PM -0600, Jim Nasby wrote: On 1/22/15 5:43 PM, Bruce Momjian wrote: This brings up the other problem that the mod times of the files are likely to be different between master and slave --- should I recommend to only use rsync --checksum? I don't think so. AIUI if the timestamps are different the very next thing it does is run the checksum (which is expensive). So --checksum is just going to hurt.

Oh, OK, good.

I am going to now conclude that rsync is never going to work for this, unless we have pg_upgrade preserve relfilenodes as well. However, I am not even sure that is possible due to conflicts with system table relfilenodes created in the new cluster. We've previously talked about required steps before an upgrade; perhaps we need a way to force an OID/relfilenode change on the old server prior to upgrade.

Actually, the idea I had forgotten is that we are not rsyncing between old and new clusters here, but between two servers that are both new after running pg_upgrade. Their relfilenodes match their oids, and the oids match, so we should be fine.

-- Bruce Momjian  br...@momjian.us  http://momjian.us
EnterpriseDB  http://enterprisedb.com
+ Everyone has their own god. +
[HACKERS] Minor issues with code comments related to abbreviated keys
Attached patch fixes minor issues in code comments that relate to abbreviated keys.

-- Peter Geoghegan

diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index c79b641..cfa1921 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -2088,7 +2088,7 @@ bttext_abbrev_convert(Datum original, SortSupport ssup)
 	 *
 	 * First, Hash key proper, or a significant fraction of it. Mix in length
 	 * in order to compensate for cases where differences are past
-	 * CACHE_LINE_SIZE bytes, so as to limit the overhead of hashing.
+	 * PG_CACHE_LINE_SIZE bytes, so as to limit the overhead of hashing.
 	 */
 	hash = hash_any((unsigned char *) authoritative_data,
 					Min(len, PG_CACHE_LINE_SIZE));
diff --git a/src/include/utils/sortsupport.h b/src/include/utils/sortsupport.h
index 62fedfa..44c596f 100644
--- a/src/include/utils/sortsupport.h
+++ b/src/include/utils/sortsupport.h
@@ -165,8 +165,8 @@ typedef struct SortSupportData
 	 * may set it to NULL to indicate abbreviation should not be used (which is
 	 * something sortsupport routines need not concern themselves with).
 	 * However, sortsupport routines must not set it when it is immediately
-	 * established that abbreviation should not proceed (for abbreviation
-	 * calls, or platform-specific impediments to using abbreviation).
+	 * established that abbreviation should not proceed (e.g., for !abbreviate
+	 * calls, or due to platform-specific impediments to using abbreviation).
 	 */
 	Datum		(*abbrev_converter) (Datum original, SortSupport ssup);
Re: [HACKERS] pg_upgrade and rsync
On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote: On 2015-01-22 14:20:51 -0500, Bruce Momjian wrote: It is possible to upgrade with pg_upgrade on streaming standby servers by making them master servers, running pg_upgrade on them, then shutting down all servers and using rsync to make the standby servers match the real master. Isn't that a pretty crazy procedure? If you need to shut down all Yes, it is crazy, but so is pg_upgrade. ;-) servers anyway, you can just rsync after having run pg_upgrade on the master, no? Rsync won't really transfer less just because you ran a similar thing on the standby. Uh, yeah, it will, because the files can get renamed as part of the upgrade (relfilenode now matches oid), so by running the upgrade, file names are going to match up. I didn't think rsync could handle renaming of files without recopying the entire file. Even if this would allow to avoid some traffic for rsync: There's absolutely no guarantee that the standby's pg_upgrade results in an all-that-similar data directory. Far from everything in postgres is deterministic - it's easy to hit timing differences that result in noticeable differences. Right, some non-deterministic things would change, but I thought running upgrade on the standby would help. However, now that I think of it, we don't preserve the database directory name and assume dbs will get the same oid and therefore the same database directory name on both, but if you use -j, things are going to happen in random order. Oops. Oh well. Or do you - as the text edited in your patch, but not the quote above - mean to run pg_upgrade just on the primary and then rsync? No, I was going to run it on both, then rsync. I am thinking the fix for standbys would be similar to what we recommend for upgrades with link mode using an rsync-created copy, e.g. use rsync while the master is running to create a copy of the standby, then shut down the master and run rsync again.
However, at that point, you might as well just take a base backup and be done with it. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] tracking commit timestamps
On 05/01/15 17:50, Petr Jelinek wrote: On 05/01/15 16:17, Petr Jelinek wrote: On 05/01/15 07:28, Fujii Masao wrote: On Thu, Dec 4, 2014 at 12:08 PM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, Dec 3, 2014 at 11:54 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: Pushed with some extra cosmetic tweaks. I got the following assertion failure when I executed pg_xact_commit_timestamp() in the standby server. =# select pg_xact_commit_timestamp('1000'::xid); TRAP: FailedAssertion(!(((oldestCommitTs) != ((TransactionId) 0)) == ((newestCommitTs) != ((TransactionId) 0))), File: commit_ts.c, Line: 315) server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: 2014-12-04 12:01:08 JST sby1 LOG: server process (PID 15545) was terminated by signal 6: Aborted 2014-12-04 12:01:08 JST sby1 DETAIL: Failed process was running: select pg_xact_commit_timestamp('1000'::xid); The way to reproduce this problem is #1. set up and start the master and standby servers with track_commit_timestamp disabled #2. enable track_commit_timestamp in the master and restart the master #3. run some write transactions #4. enable track_commit_timestamp in the standby and restart the standby #5. execute select pg_xact_commit_timestamp('1000'::xid) in the standby BTW, at the step #4, I got the following log messages. This might be a hint for this problem. LOG: file pg_commit_ts/ doesn't exist, reading as zeroes CONTEXT: xlog redo Transaction/COMMIT: 2014-12-04 12:00:16.428702+09; inval msgs: catcache 59 catcache 58 catcache 59 catcache 58 catcache 45 catcache 44 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 catcache 7 catcache 6 snapshot 2608 relcache 16384 This problem still happens in the master. 
Regards, Attached patch fixes it; I am not sure how happy I am with the way I did it, though. Added more comments and made the error message a bit clearer. Fujii, Alvaro, did one of you have a chance to look at this fix? -- Petr Jelinek http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
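The failed assertion encodes an invariant that can be sketched in isolation: the oldest and newest XIDs with known commit timestamps must be both valid or both invalid. A minimal model of the check (types simplified; not the actual commit_ts.c code — this only illustrates the invariant the reported crash violates):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;          /* simplified stand-in */
#define InvalidTransactionId ((TransactionId) 0)

/* The invariant behind the failed assertion: the oldest and newest
 * XIDs with known commit timestamps must be both set or both unset.
 * A standby that enables track_commit_timestamp later than the master
 * can apparently end up with this pair half-initialized during
 * replay, tripping the assertion. */
static bool commit_ts_bounds_consistent(TransactionId oldest,
                                        TransactionId newest)
{
    return (oldest != InvalidTransactionId) ==
           (newest != InvalidTransactionId);
}
```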
Re: [HACKERS] Proposal: knowing detail of config files via SQL
On 1/22/15 11:13 AM, Sawada Masahiko wrote: Hi, As per discussion http://www.postgresql.org/message-id/cad21aodkds8oqbr199wwrcp7fidvow6bbb+cgdwqhuf+gx_...@mail.gmail.com, I would like to propose a new view, pg_file_settings, to show details of the config files via SQL. - Background In 9.4, postgresql.auto.conf was added to support the ALTER SYSTEM command, and that config file is loaded after postgresql.conf whenever postgresql.conf is loaded. That is, postgresql.auto.conf has quite high priority, so a value the DBA sets manually in postgresql.conf may not take effect at all. The ALTER SYSTEM RESET command can remove an unnecessary value from postgresql.auto.conf, but there is no way to know where the value came from. (They can only give information about the setting in the last file where it is present.) - Solution The patch is not implemented yet; I am just proposing for now. I'm imagining a new pg_file_settings view with the following columns, storing the values currently assigned in the config files: - GUC value name - GUC value - config file path (e.g. /opt/data/postgresql.conf, /opt/data/postgresql.auto.conf, /opt/hoge.conf) This view could make it convenient for the DBA to decide whether a postgresql.auto.conf entry is useful or not, and if it's not useful, the DBA could use the ALTER SYSTEM ... RESET command to remove it from postgresql.auto.conf. Would this view have a row for every option in a config file? IE: if you set something in both postgresql.conf and postgresql.auto.conf, would it show up twice? I think it should, and that there should be a way to see which setting is actually in effect. It looks like you're attempting to handle #include, yes? Another idea is to add additional columns to the existing view (pg_settings), per the previous discussion. I think it would be useful to have a separate view that shows all occurrences of a setting. I recall some comment about source_file and source_line not always being correct in pg_settings; if that's true we should fix that.
-- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
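The duplicate-entry question comes down to GUC file-processing order: files are read in sequence (postgresql.conf, then postgresql.auto.conf) and the last occurrence of a setting wins — which is exactly the information pg_settings discards and the proposed view would keep. A toy model of that precedence (all names illustrative, not actual PostgreSQL structures):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row per occurrence of a setting in a config file -- roughly the
 * shape the proposed pg_file_settings view would expose. */
struct file_setting {
    const char *name;
    const char *value;
    const char *sourcefile;
};

/* GUC file processing reads postgresql.conf first and
 * postgresql.auto.conf last, so the effective value is the LAST
 * occurrence -- which is why pg_settings alone cannot tell you that a
 * postgresql.conf entry is being shadowed. */
static const char *effective_value(const struct file_setting *entries,
                                   size_t n, const char *name)
{
    const char *result = NULL;
    for (size_t i = 0; i < n; i++)
        if (strcmp(entries[i].name, name) == 0)
            result = entries[i].value;   /* later occurrence shadows earlier */
    return result;
}
```

Under this model, a view listing every `file_setting` row shows both occurrences (Jim's "show up twice"), while `effective_value()` is what pg_settings reports today.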
[HACKERS] pg_upgrade and rsync
It is possible to upgrade with pg_upgrade on streaming standby servers by making them master servers, running pg_upgrade on them, then shutting down all servers and using rsync to make the standby servers match the real master. However, Josh Berkus reported that he found that hint bits set on the master server but not on the standby servers made rsync too slow. The attached pg_upgrade doc patch recommends using wal_log_hints to make hint bits on the standby match the master. One question I have is whether hint bits are set by read-only transactions on standby servers. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/doc/src/sgml/pgupgrade.sgml b/doc/src/sgml/pgupgrade.sgml new file mode 100644 index e1cd260..07cb1f9 *** a/doc/src/sgml/pgupgrade.sgml --- b/doc/src/sgml/pgupgrade.sgml *** psql --username postgres --file script.s *** 553,559 is to upgrade the primary and use commandrsync/ to rebuild the standbys. You can run commandrsync/ while the primary is down, or as part of a base backup (xref linkend=backup-base-backup) !which overwrites the old standby cluster. /para para --- 553,565 is to upgrade the primary and use commandrsync/ to rebuild the standbys. You can run commandrsync/ while the primary is down, or as part of a base backup (xref linkend=backup-base-backup) !which overwrites the old standby cluster. Hint bit differences !between the primary and standby can cause commandrsync/ to copy !additional data mdash; these differences can be reduced by using !a fresh standby and by enabling xref linkend=guc-wal-log-hints. !(While varnamewal_log_hints/ transfers hint bits from the primary !to standbys, additional hint bits are still set on the standbys by !read-only queries.) /para para -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
Andres Freund wrote: 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new WAL records. Doable, but probably too large/invasive to backpatch. Not to mention the extra WAL traffic ... -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] proposal: plpgsql - Assert statement
Hi, here is an updated patch 2015-01-21 23:28 GMT+01:00 Jim Nasby jim.na...@bluetreble.com: On 1/21/15 3:10 PM, Pavel Stehule wrote: is there some agreement on this behavior of the ASSERT statement? I would like to assign the last patch to the next commitfest. Possible changes: * I would like to simplify the behavior of evaluating the message expression - probably I will disallow NULL there. Well, the only thing I could see you doing there is throwing a different error if the hint is null. I don't see that as an improvement. I'd just leave it as-is. I enabled NULL - but enforced a WARNING before. * GUC enable_asserts will be supported That would be good. Would that allow for enabling/disabling on a per-function basis too? sure - the only question is whether we develop a #option enable|disable_asserts. I have no strong idea. * an assert exception should not be handled by the PLpgSQL handler -- like the CANCEL exception +1 -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com commit 60f85cb5f284bf812ee73d321850d25313d19d65 Author: Pavel Stehule pavel.steh...@gooddata.com Date: Thu Jan 22 20:56:31 2015 +0100 initial diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 6bcb106..5663983 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -6912,6 +6912,20 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir' variablelist + varlistentry id=guc-enable-user-asserts xreflabel=enable_user_asserts + termvarnameenable_user_asserts/varname (typeboolean/type) + indexterm + primaryvarnameenable_user_asserts/ configuration parameter/primary + /indexterm + /term + listitem + para +If true, any user assertions are evaluated. By default, this +is set to true.
+ /para + /listitem + /varlistentry + varlistentry id=guc-exit-on-error xreflabel=exit_on_error termvarnameexit_on_error/varname (typeboolean/type) indexterm diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index 69a0885..26f7eba 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -3372,6 +3372,9 @@ END LOOP optional replaceablelabel/replaceable /optional; sect1 id=plpgsql-errors-and-messages titleErrors and Messages/title + sect2 id=plpgsql-statements-raise +titleRAISE statement/title + indexterm primaryRAISE/primary /indexterm @@ -3564,7 +3567,33 @@ RAISE unique_violation USING MESSAGE = 'Duplicate user ID: ' || user_id; the whole category. /para /note + /sect2 + + sect2 id=plpgsql-statements-assert +titleASSERT statement/title + indexterm +primaryASSERT/primary + /indexterm + + indexterm +primaryassertions/primary +secondaryin PL/pgSQL/secondary + /indexterm + + para +Use the commandASSERT/command statement to ensure that an expected +predicate is always true. If the predicate is false or is null, +then an assertion exception is raised. + +synopsis +ASSERT replaceable class=parameterexpression/replaceable optional, replaceable class=parametermessage expression/replaceable /optional; +/synopsis + +User assertions can be enabled or disabled by +xref linkend=guc-enable-user-asserts.
+ /para + /sect2 /sect1 sect1 id=plpgsql-trigger diff --git a/src/backend/utils/errcodes.txt b/src/backend/utils/errcodes.txt index 28c8c40..da12428 100644 --- a/src/backend/utils/errcodes.txt +++ b/src/backend/utils/errcodes.txt @@ -454,6 +454,7 @@ PEERRCODE_PLPGSQL_ERROR plp P0001EERRCODE_RAISE_EXCEPTIONraise_exception P0002EERRCODE_NO_DATA_FOUND no_data_found P0003EERRCODE_TOO_MANY_ROWS too_many_rows +P0004EERRCODE_ASSERT_EXCEPTION assert_exception Section: Class XX - Internal Error diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c index c35867b..cd55e94 100644 --- a/src/backend/utils/init/globals.c +++ b/src/backend/utils/init/globals.c @@ -99,6 +99,7 @@ bool IsBinaryUpgrade = false; bool IsBackgroundWorker = false; bool ExitOnAnyError = false; +bool enable_user_asserts = true; int DateStyle = USE_ISO_DATES; int DateOrder = DATEORDER_MDY; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index f6df077..b3a2660 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -980,6 +980,15 @@ static struct config_bool ConfigureNamesBool[] = }, { + {enable_user_asserts, PGC_USERSET, ERROR_HANDLING_OPTIONS, + gettext_noop(Enable user asserts checks.), + NULL + }, + enable_user_asserts, + true, + NULL, NULL, NULL + }, + { {exit_on_error, PGC_USERSET, ERROR_HANDLING_OPTIONS,
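The enable_user_asserts GUC in the patch is just a process-local boolean consulted before each assertion is evaluated. A standalone sketch of that gating behavior (names mirror the patch; logic simplified — the real code raises ERRCODE_ASSERT_EXCEPTION rather than returning a flag):

```c
#include <assert.h>
#include <stdbool.h>

/* Process-local flag standing in for the proposed enable_user_asserts
 * GUC (a PGC_USERSET boolean, default true). */
static bool enable_user_asserts = true;

/* Sketch of the check an ASSERT statement would perform: when the GUC
 * is off, assertions pass unconditionally; when it is on, a false
 * predicate signals the equivalent of ERRCODE_ASSERT_EXCEPTION.
 * Returns true when execution may continue. */
static bool user_assert_passes(bool predicate)
{
    if (!enable_user_asserts)
        return true;            /* assertions disabled: always pass */
    return predicate;
}
```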
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 11:28 AM, Robert Haas robertmh...@gmail.com wrote: This seems to have broken more stuff. My working hypothesis is that the culprit is here: /* * There is no special handling of the C locale here, unlike with * varstr_cmp(). strxfrm() is used indifferently. */ As far as I can see, that's just hoping that when the locale is C, strxfrm() is memcpy(). If it isn't, then you will potentially get different results when the abbreviated keys stuff is used vs. when it isn't, because when it isn't, we're going to memcmp(), and when it is, we're going to memcmp() the results of strxfrm(). As noted also on the thread Kevin started, I've now pushed a fix for this and am eagerly awaiting new buildfarm returns. I wonder if this might account for the ordering failures we were seeing on Windows before. We thought it was a Windows-specific issue, but I bet the real problem was here: if (collid != DEFAULT_COLLATION_OID) { if (!OidIsValid(collid)) { /* * This typically means that the parser could not resolve a * conflict of implicit collations, so report it that way. */ ereport(ERROR, (errcode(ERRCODE_INDETERMINATE_COLLATION), errmsg(could not determine which collation to use for string comparison), errhint(Use the COLLATE clause to set the collation explicitly.))); } #ifdef HAVE_LOCALE_T tss->locale = pg_newlocale_from_collation(collid); #endif } Before the abbreviated keys commit, this code only ran if we'd already determined that lc_collate_is_c(collid) was false. But that commit made this also run when that value was true. So if HAVE_LOCALE_T and collid != DEFAULT_COLLATION_OID, then tss->locale was getting set. If it got set to something that made strxfrm() behave like memcmp(), then all was well. If not, then life was bad. Now you'd hope that would work out, but maybe not. Also, suppose HAVE_LOCALE_T was defined but collid == DEFAULT_COLLATION_OID.
Then tss->locale didn't get initialized at all, and bttext_abbrev_convert() just passed whatever stupid value it found there to strxfrm_l(). Finally, suppose we *didn't* HAVE_LOCALE_T. Then, the non-abbreviated comparator knew it should use memcmp(), but the abbreviated comparator was happy to use strxfrm(), even though that would use the current locale, not the C locale that the user requested. Long story short: this was broken. It may be that when the dust settles we can try re-enabling this on Windows. It might work now that this issue is (hopefully) fixed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
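For context, the guarantee the C standard actually gives is narrower than "strxfrm() is memcpy()": ordering strxfrm() outputs with strcmp() must agree with strcoll() on the originals, and strcoll() in the C locale agrees with strcmp(). Whether the transformation is literally a copy is implementation-defined (it happens to be on glibc). A small demonstration of the part that is guaranteed (illustrative only, not PostgreSQL code; fixed buffers assume short inputs):

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* In the C locale, comparing strxfrm()-transformed keys with strcmp()
 * must give the same ordering as strcmp() on the originals (via the
 * strxfrm/strcoll equivalence).  That equivalence of *ordering* is all
 * the standard promises -- the transformed bytes themselves need not
 * equal the input, which is why assuming strxfrm() == memcpy() in the
 * abbreviated-keys code was fragile. */
static int sign(int x) { return (x > 0) - (x < 0); }

static int cmp_transformed(const char *a, const char *b)
{
    char ta[64], tb[64];        /* assumes short inputs for the demo */
    strxfrm(ta, a, sizeof(ta));
    strxfrm(tb, b, sizeof(tb));
    return sign(strcmp(ta, tb));
}
```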
Re: [HACKERS] collate test now failing
On Thu, Jan 22, 2015 at 12:07 PM, Bruce Momjian br...@momjian.us wrote: Back to watching the buildfarm returns roll in... Does this explain the Windows failures too? The Windows machines have mostly been failing since the original abbreviated keys patch went in. See the message I just posted on the Windows buildfarm animals are still not happy with abbreviated keys patch thread for a fuller analysis. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] collate test now failing
Kevin Grittner kgri...@ymail.com wrote: I think this may have just started with: b529b65d1bf8537ca7fa024760a9782d7c8b66e5 Confirmed that I get a clean check on the prior commit. ubuntu 14.04 LTS -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] collate test now failing
On Thu, Jan 22, 2015 at 11:50 AM, Kevin Grittner kgri...@ymail.com wrote: Kevin Grittner kgri...@ymail.com wrote: I think this may have just started with: b529b65d1bf8537ca7fa024760a9782d7c8b66e5 Confirmed that I get a clean check on the prior commit. Can you check whether this fixes it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company no-strxfrm-for-c.patch Description: binary/octet-stream -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Additional role attributes superuser review
Adam, all, * Adam Brightwell (adam.brightw...@crunchydatasolutions.com) wrote: After re-reading through this thread is seems like EXCLUSIVEBACKUP (proposed by Magnus) seemed to be a potentially acceptable alternative. I just chatted a bit on IRC w/ Magnus about this and I'm on-board with his proposal to use EXCLUSIVEBACKUP, but there's a couple additional things which need doing as part of that: We need to actually define what an 'exclusive backup' is in the documentation- it's not there today. pg_start_backup/pg_stop_backup docs should be updated to refer to those operations as relating to on-line exclusive backup, same as pg_is_in_backup() and pg_backup_start_time() do. The docs for the EXCLUSIVEBACKUP attribute should, of course, refer to where EXCLUSIVE BACKUP is defined. Hopefully that wraps up the last of the debate around the naming of these options and we can move forward, once the patch is updated to reflect this and docs are written. Would be great to hear from anyone who feels otherwise. Regarding the pg_dump/read-only/etc discussion- that wasn't intended to be part of this and I continue to feel it'd be best addressed on a new thread specifically for that. Once we have this initial round of role attribute additions settled, I'd be more than happy to support that discussion and see what we can do to provide such a role. Thanks! Stephen signature.asc Description: Digital signature
Re: [HACKERS] collate test now failing
On Thu, Jan 22, 2015 at 12:04:14PM -0500, Robert Haas wrote: On Thu, Jan 22, 2015 at 11:53 AM, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 22, 2015 at 11:50 AM, Kevin Grittner kgri...@ymail.com wrote: Kevin Grittner kgri...@ymail.com wrote: I think this may have just started with: b529b65d1bf8537ca7fa024760a9782d7c8b66e5 Confirmed that I get a clean check on the prior commit. Can you check whether this fixes it? Kevin says (via IM) that it does, but with a compiler warning. Fixed that and pushed. Back to watching the buildfarm returns roll in... Does this explain the Windows failures too? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] collate test now failing
I think this may have just started with: b529b65d1bf8537ca7fa024760a9782d7c8b66e5 -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company regression.diffs Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Proposal: knowing detail of config files via SQL
Hi, As per discussion http://www.postgresql.org/message-id/cad21aodkds8oqbr199wwrcp7fidvow6bbb+cgdwqhuf+gx_...@mail.gmail.com, I would like to propose a new view, pg_file_settings, to show details of the config files via SQL. - Background In 9.4, postgresql.auto.conf was added to support the ALTER SYSTEM command, and that config file is loaded after postgresql.conf whenever postgresql.conf is loaded. That is, postgresql.auto.conf has quite high priority, so a value the DBA sets manually in postgresql.conf may not take effect at all. The ALTER SYSTEM RESET command can remove an unnecessary value from postgresql.auto.conf, but there is no way to know where the value came from. (They can only give information about the setting in the last file where it is present.) - Solution The patch is not implemented yet; I am just proposing for now. I'm imagining a new pg_file_settings view with the following columns, storing the values currently assigned in the config files: - GUC value name - GUC value - config file path (e.g. /opt/data/postgresql.conf, /opt/data/postgresql.auto.conf, /opt/hoge.conf) This view could make it convenient for the DBA to decide whether a postgresql.auto.conf entry is useful or not, and if it's not useful, the DBA could use the ALTER SYSTEM ... RESET command to remove it from postgresql.auto.conf. Another idea is to add additional columns to the existing view (pg_settings), per the previous discussion. Please give me comments. Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Thu, Jan 22, 2015 at 11:41 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Wed, Jan 21, 2015 at 9:43 PM, Sawada Masahiko sawada.m...@gmail.com wrote: On Wed, Jan 21, 2015 at 3:38 PM, Amit Kapila amit.kapil...@gmail.com wrote: On Tue, Jan 20, 2015 at 9:38 PM, Robert Haas robertmh...@gmail.com wrote: The reason why sourcefile and sourceline are not sufficient is that they can only give the information about the setting in the last file where it is present. Assume max_connections (or any other setting) is available in both postgresql.conf and postgresql.auto.conf; then it will display the information about the setting in postgresql.auto.conf, so the user might not be able to decide whether that is the setting he wants to retain unless he knows about the setting in postgresql.conf. Now, as I have suggested upthread, we can have a new view pg_file_settings which will display information about settings even when there exist multiple entries for the same setting in different files. I think adding such information to the existing view pg_settings would be difficult, as the same code is used for SHOW commands, which we don't want to change. I think this new view would be updated only when the postmaster receives SIGHUP or is started. And we could have a new function, like pg_update_file_setting(), which updates this view. If that is doable without much complication, then it might not be a bad idea to just add additional columns to the existing view (pg_settings). I think you can first evaluate the details, like what additional columns (other than what pg_settings has) are required and how you want to update them. After doing so, further discussion could be more meaningful. I have posted a mail in another thread to discuss this. cad21aoa8_mejmndhcekt8cdyfup8fpboxap0n0-lxervv7f...@mail.gmail.com Regards, --- Sawada Masahiko -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Wed, Jan 21, 2015 at 1:32 AM, David Johnston david.g.johns...@gmail.com wrote: to make the whole thing work. Maybe it does all fit directly on pg_settings but tacking on some read-only columns to this updateable view/table doesn't come across as something that should be forbidden in general. No, of course not. But they should be things that are of general utility (we agree that most people will want them) and they should be orthogonal to what we already have (not just a duplication of something that's present elsewhere). Maybe I am imagining a use-case that just isn't there but if there are two separate queries needed, and we call one consolidated, then having that query indicate whether the other query has useful information, and it is quite likely that it will not, avoids the user expending the effort to run the wasteful secondary query. Well, we shouldn't assume that everyone uses the same queries, or issues them in the same order, so I think this is pretty speculative. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] New CF app deployment
Hi, Thanks for the excellent work on the new commitfest app. It looks awesome so far, though I'm betting the commitfest manager is the one who reaps the most benefits. Wanted to highlight this request: Andres Freund wrote: What I'm also missing from the old app is that previously 'reviews' could explicitly be linked from the app. Now there's a list of emails in the thread, nice!, but in bigger threads that really doesn't help to find the actual review. Note for instance the BRIN inclusion operator class patch here, https://commitfest.postgresql.org/3/31/ The link points to the generic BRIN thread I started (you can see it in the history). If you list all attachments you can see that all the BRIN patches are linked; Emre's patch is there, it seems, only because it's the one most recently posted. Not sure what to do here, but it's somewhat annoying. Also, I was pretty used to offline operation with the old one: I could load the page, for instance https://commitfest-old.postgresql.org/action/commitfest_view?id=25 and the patch links alongside each patch had the message-ids with which I could search my local mbox. In the new one, the only way I can get the message-id is by opening the patch page. It would be pretty useful to have a "copy message-id to clipboard" button, or something similar, so that I could transfer operation from the web browser to my mail client. Having one message-id per patch in the commitfest summary page is enough for my use case (probably pointing to the most recent attachment in the linked threads.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] BRIN range operator class
Can you please break up this patch? I think I see three patches: 1. add sql-callable functions such as inet_merge, network_overright, etc. These need documentation and a trivial regression test somewhere. 2. necessary changes to header files (skey.h etc) 3. the inclusion opclass itself Thanks. BTW, the main idea behind having opcinfo return the type oid was to tell the index what was stored in the index. If that doesn't work right now, maybe it needs some tweak to the brin framework code. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
On 1/22/15 1:43 PM, Alvaro Herrera wrote: Andres Freund wrote: 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new WAL records. Doable, but probably too large/invasive to backpatch. Not to mention the extra WAL traffic ... Yeah, I don't know that we actually want #2. It's bad enough to copy an entire database locally, but to then put its entire contents into WAL? Blech. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 1:00 PM, Peter Geoghegan p...@heroku.com wrote: On Thu, Jan 22, 2015 at 9:55 AM, Robert Haas robertmh...@gmail.com wrote: Stay tuned for more exciting dispatches from the department of abbreviated keys! I certainly suspected that we'd have problems with Windows, but nothing this bad. This isn't really Windows-specific. The root of the problem is that when LC_COLLATE=C you were trying to use strxfrm() for the abbreviated key even though memcmp() is the authoritative comparator in that case. Exactly which platforms happened to blow up as a result of that is kind of beside the point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade and rsync
On 1/22/15 2:19 PM, Heikki Linnakangas wrote: On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; What if XLogHintBitIsNeeded is false? That would be the case if we're not WAL-logging hints *on the standby*. -- Jim Nasby, Data Architect, Blue Treble Consulting Data in Trouble? Get it in Treble! http://BlueTreble.com
Re: [HACKERS] pg_upgrade and rsync
On 01/22/2015 10:34 PM, Jim Nasby wrote: On 1/22/15 2:19 PM, Heikki Linnakangas wrote: On 01/22/2015 09:20 PM, Bruce Momjian wrote: One question I have is whether hint bits are set by read-only transactions on standby servers. No. See comments in MarkBufferDirtyHint: /* * If we need to protect hint bit updates from torn writes, WAL-log a * full page image of the page. This full page image is only necessary * if the hint bit update is the first change to the page since the * last checkpoint. * * We don't check full_page_writes here because that logic is included * when we call XLogInsert() since the value changes dynamically. */ if (XLogHintBitIsNeeded() && (bufHdr->flags & BM_PERMANENT)) { /* * If we're in recovery we cannot dirty a page because of a hint. * We can set the hint, just not dirty the page as a result so the * hint is lost when we evict the page or shutdown. * * See src/backend/storage/page/README for longer discussion. */ if (RecoveryInProgress()) return; What if XLogHintBitIsNeeded is false? That would be the case if we're not WAL-logging hints *on the standby*. Then the page will be updated without writing a WAL record. Just like on the master, if wal_log_hints is off. wal_log_hints works the same on the master or the standby. - Heikki
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Wed, Jan 21, 2015 at 1:38 AM, Amit Kapila amit.kapil...@gmail.com wrote: As I have suggested upthread, we can have a new view pg_file_settings which will display information about settings even when there exist multiple entries for the same setting in different files. I think adding such information to the existing view pg_settings would be difficult, as the same code is used for SHOW commands, which we don't want to change. A new view might work, or we could add columns that contain arrays to the existing view. But a new view may be nicer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
Here's v23. I reworked a number of things. First, I changed trivial stuff like grouping all the vacuuming options in a struct, to avoid passing an excessive number of arguments to functions. full, freeze, analyze_only, and_analyze and verbose are all in a single struct now. Also, the stage_commands and stage_messages were duplicated by your patch; I moved them to a file-level static struct. I made prepare_command reset the string buffer and receive an optional table name, so that it can append it to the generated command, and append the semicolon as well. Forcing the callers to reset the string before calling, and having them add the table name and semicolon afterwards, was awkward and unnecessarily verbose. You had a new in_abort() function in common.c which seems an unnecessary layer; in its place I just exported the inAbort boolean flag it was returning, and renamed it to CancelRequested. I was then troubled by the fact that vacuum_one_database() was being called in a loop by main() when multiple tables are vacuumed, but vacuum_parallel() was doing the loop internally. I found this discrepancy confusing, so I renamed that new function to vacuum_one_database_parallel and modified the original vacuum_one_database to do the loop internally as well. Now they are, in essence, a mirror of each other, one doing the parallel stuff and one doing it serially. This seems to make more sense to me -- but see below. I also modified some underlying stuff like GetIdleSlot returning a ParallelSlot pointer instead of an array index. Since its caller always has to dereference the array with the given index, it makes more sense to return the right element pointer instead, so I made it do that. Also, that way, instead of returning NO_SLOT in case of error it can just return NULL; no need for extra cognitive burden. I also changed select_loop. In your patch it had two implementations, one for WIN32 and another for the rest.
It looks nicer to me to have only one, with small exceptions in the places that need them. (I haven't tested the WIN32 path.) Also, instead of returning ERROR_IN_ABORT I made it set a boolean flag in case of error, which seems cleaner. I changed GetQueryResult as I described in a previous message. There are two things that continue to bother me and I would like you, dear patch author, to change them before committing this patch: 1. I don't like having vacuum_one_database() and a separate vacuum_one_database_parallel(). I think we should merge them into one function, which does either thing according to parameters. There's plenty in there that's duplicated. 2. In particular, the above means that run_parallel_vacuum can no longer exist as it is. Right now vacuum_one_database_parallel relies on run_parallel_vacuum to do the actual job parallelization. I would like to have that looping in the improved vacuum_one_database() function instead. Looking forward to v24, -- Álvaro Herrera    http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/doc/src/sgml/ref/vacuumdb.sgml b/doc/src/sgml/ref/vacuumdb.sgml
index 3ecd999..211235a 100644
--- a/doc/src/sgml/ref/vacuumdb.sgml
+++ b/doc/src/sgml/ref/vacuumdb.sgml
@@ -204,6 +204,25 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>-j <replaceable class="parameter">jobs</replaceable></option></term>
+      <term><option>--jobs=<replaceable class="parameter">njobs</replaceable></option></term>
+      <listitem>
+       <para>
+        This option will enable the vacuum operation to run on concurrent
+        connections. Maximum number of tables can be vacuumed concurrently
+        is equal to number of jobs. If number of jobs given is more than
+        number of tables then number of jobs will be set to number of tables.
+       </para>
+       <para>
+        <application>vacuumdb</application> will open
+        <replaceable class="parameter">njobs</replaceable> connections to the
+        database, so make sure your <xref linkend="guc-max-connections">
+        setting is high enough to accommodate all connections.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>--analyze-in-stages</option></term>
       <listitem>
        <para>
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index d942a75..1bf7611 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -1160,7 +1160,7 @@ select_loop(int maxFd, fd_set *workerset)
 	i = select(maxFd + 1, workerset, NULL, NULL, NULL);
 
 	/*
-	 * If we Ctrl-C the master process , it's likely that we interrupt
+	 * If we Ctrl-C the master process, it's likely that we interrupt
 	 * select() here. The signal handler will set wantAbort == true and
 	 * the shutdown journey starts from here. Note that we'll come back
 	 * here later when we tell all workers to terminate and read their
diff --git a/src/bin/scripts/common.c b/src/bin/scripts/common.c
index 6bfe2e6..da142aa
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 12:17 PM, Robert Haas robertmh...@gmail.com wrote: Long story short: this was broken. It may be that when the dust settles we can try re-enabling this on Windows. It might work now that this issue is (hopefully) fixed. Uggh. That still wasn't right; I've pushed commit d060e07fa919e0eb681e2fa2cfbe63d6c40eb2cf to try to fix it. Stay tuned for more exciting dispatches from the department of abbreviated keys! -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Merging postgresql.conf and postgresql.auto.conf
On Wed, Jan 21, 2015 at 11:13 AM, Sawada Masahiko sawada.m...@gmail.com wrote: I think this new view is updated only when the postmaster receives SIGHUP or is started. And we can have a new function like pg_update_file_setting() which updates this view. I really don't think the postmaster should be in the business of trying to publish views of the data. Let the individual backend reveal its own view, and keep the postmaster out of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Windows buildfarm animals are still not happy with abbreviated keys patch
On Thu, Jan 22, 2015 at 9:55 AM, Robert Haas robertmh...@gmail.com wrote: Stay tuned for more exciting dispatches from the department of abbreviated keys! I certainly suspected that we'd have problems with Windows, but nothing this bad. -- Peter Geoghegan
Re: [HACKERS] basebackups during ALTER DATABASE ... SET TABLESPACE ... not safe?
On 2015-01-22 14:42:18 -0600, Jim Nasby wrote: On 1/22/15 1:43 PM, Alvaro Herrera wrote: Andres Freund wrote: 2) Make movedb() (and possibly createdb(), not sure yet) use proper WAL logging and log the whole copied data. I think this is the right long-term fix and would end up being much more reliable. But it either requires some ugliness during redo (creating nonexistent database directories on the fly during redo) or new WAL records. Doable, but probably too large/invasive to backpatch. Not to mention the extra WAL traffic ... Yeah, I don't know that we actually want #2. It's bad enough to copy an entire database locally The local copy is pretty much fundamental. Given that tablespaces usually will be on different filesystems there's not much else we can do. If we want an optimization for moving databases across tablespaces when both are on the same filesystem and wal_level >= archive, we can do that. But that's pretty independent. And I doubt it's worthwhile the developer time. , but to then put its entire contents into WAL? Blech. Besides actually having a chance of being correct, doing so will save having to do two checkpoints inside movedb(). I think it's pretty likely that that actually saves overall IO, even including the WAL writes. Especially if there are other databases in the cluster at the same time. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] jsonb, unicode escapes and escaped backslashes
On Wed, Jan 21, 2015 at 06:51:34PM -0500, Andrew Dunstan wrote: The following case has just been brought to my attention (look at the differing number of backslashes): andrew=# select jsonb '"\\u0000"'; jsonb -- "\u0000" (1 row) andrew=# select jsonb '"\u0000"'; jsonb -- "\u0000" (1 row) A mess indeed. The input is unambiguous, but the jsonb representation can't distinguish \u0000 from \\u0000. Some operations on the original json type have similar problems, since they use an in-memory binary representation with the same shortcoming: [local] test=# select json_array_element_text($$["\u0000"]$$, 0) = test-# json_array_element_text($$["\\u0000"]$$, 0); ?column? -- t (1 row) Things get worse, though. On output, '\uabcd' for any four hex digits is recognized as a unicode escape, and thus the backslash is not escaped, so that we get: andrew=# select jsonb '"\\uabcd"'; jsonb -- "\uabcd" (1 row) We could probably fix this fairly easily for non-U+0000 cases by having jsonb_to_cstring use a different escape_json routine. Sounds reasonable. For 9.4.1, before more people upgrade? But it's a mess, sadly, and I'm not sure what a good fix for the U+0000 case would look like. Agreed. When a string unescape algorithm removes some kinds of backslash escapes and not others, it's nigh inevitable that two semantically distinct inputs can yield the same output. json_lex_string() fell into that trap by making an exception for \u0000. To fix this, the result needs to be fully unescaped (\u0000 converted to the NUL byte) or retain all backslash escapes. (Changing that either way is no fun now that an on-disk format is at stake.) Maybe we should detect such input and emit a warning of ambiguity? It's likely to be rare enough, but clearly not as rare as we'd like, since this is a report from the field. Perhaps. Something like WARNING: jsonb cannot represent \\u0000; reading as \u0000? Alas, but I do prefer that to silent data corruption.
Thanks, nm
Re: [HACKERS] B-Tree support function number 3 (strxfrm() optimization)
On 20 January 2015 at 17:10, Peter Geoghegan p...@heroku.com wrote: On Mon, Jan 19, 2015 at 7:47 PM, Michael Paquier michael.paqu...@gmail.com wrote: With your patch applied, the failure with MSVC disappeared, but there is still a warning showing up: (ClCompile target) - src\backend\lib\hyperloglog.c(73): warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) That seems harmless. I suggest an explicit cast to Size here. This caught my eye too. I agree about casting to Size. Patch attached. Regards David Rowley hyperloglog_bitshift_fix.patch