Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Tue, 2014-04-01 at 13:08 -0400, Tom Lane wrote:
> I think a patch that stood a chance of getting committed would need to
> detect whether the aggregate was being called in simple or grouped
> contexts, and apply different behaviors in the two cases.

The simple context doesn't seem like a big problem even if we change things as Tomas suggests: "IMNSHO these are the issues we really should fix - by lowering the initial element count (64->4) and using a single memory context."

In the simple context, there's only one context regardless, so the only cost I see is from reducing the initial allocation from 64 to some lower number. But if we're doubling each time, it won't take long to get there; and because it's the simple context, we only need to do it once.

Regards,
	Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ExclusiveLock on extension of relation with huge shared_buffers
On 25 Oct 2014, at 4:31, Jim Nasby wrote:
> Please don't top-post.
>
> On 10/24/14, 3:40 AM, Borodin Vladimir wrote:
>> I have taken some backtraces (they are attached to the letter) of two
>> processes with such a command:
>>
>> pid=17981; while true; do date; gdb -batch -e back /usr/pgsql-9.4/bin/postgres $pid; echo; echo; echo; echo; sleep 0.1; done
>>
>> Process 17981 was holding the lock for a long time - http://pastie.org/9671931.
>> And process 13886 was waiting for the lock (at a different time and from a
>> different blocker, actually, but I don't think that is really important) -
>> http://pastie.org/9671939.
>>
>> As I can see, 17981 is actually waiting for the BufFreelistLock LWLock in the
>> StrategyGetBuffer function, freelist.c:134, while holding an exclusive lock on
>> the relation. I will try to increase NUM_BUFFER_PARTITIONS (on a read-only
>> load it also gave us some performance boost) and write the result in this
>> thread.
>
> BufFreelistLock becomes very contended when shared buffers are under a lot of
> pressure.
>
> Here's what I believe is happening:
>
> If RelationGetBufferForTuple() decides it needs to extend, this happens:
>
>     LockRelationForExtension(relation, ExclusiveLock);
>     buffer = ReadBufferBI(relation, P_NEW, bistate);
>
> Assuming bistate is false (I didn't check the bulk case), ReadBufferBI() ends
> up at ReadBuffer_common(), which calls BufferAlloc(). In the normal case,
> BufferAlloc() won't find the necessary buffer, so it will call
> StrategyGetBuffer(), which will end up taking the freelist lock. Currently
> the free list is normally empty, which means we now need to run the clock
> sweep to find a victim buffer. The clock sweep will keep running until it
> finds a buffer that is not pinned and has usage_count = 0.
> If shared buffers are under heavy pressure, you can have a huge number of them
> with usage_count = 5; with 100GB of shared buffers and an 8K BLCKSZ, you could
> have to check buffers *52 million* times (assuming you finally find a buffer
> at the start of the 5th loop) before you find a victim.
>
> Keep in mind that's all happening while you're holding both the extension
> lock *and the freelist lock*, which basically means no one else in the entire
> system can allocate a new buffer.

I'll try the same workload with the recent patch from Andres Freund [0].

> This is one reason why a large shared_buffers setting is usually
> counter-productive. Experience with older versions is that setting it higher
> than about 8GB is more likely to hurt than to help. Newer versions are
> probably better, but I think you'll be hard-pressed to find a workload where
> 100GB makes sense. It might if your entire database fits in shared_buffers
> (though, even then there are probably a number of O(n) or worse operations
> that will hurt you), but if your database is > shared_buffers you're probably
> in trouble.
>
> I suggest cutting shared_buffers *way* down. Old-school advice for this
> machine would be 8G (since 25% of 128G would be too big). You might be able
> to do better than 8G, but I recommend not even trying unless you've got a
> good way to test your performance.
>
> If you can test performance and find an optimal setting for shared_buffers,
> please do share your test data and findings. :)

Of course, it works well with shared_buffers <= 8GB. But we have seen that on a read-only load, when the data set fits in RAM with <= 8GB of shared_buffers, we hit the BufFreelistLock LWLock while moving pages between shared buffers and the page cache. Increasing the shared_buffers size to the size of the data set improves performance up to 2.5x on this read-only load. So we started testing a configuration with huge shared_buffers under a writing load, and that's why I started this thread.
Since StrategyGetBuffer() no longer uses the BufFreelistLock LWLock [1], I'll also re-run the tests with a read-only load and small shared_buffers.

[0] http://git.postgresql.org/pg/commitdiff/d72731a70450b5e7084991b9caa15cb58a2820df
[1] http://git.postgresql.org/pg/commitdiff/1dcfb8da09c47d2a7502d1dfab06c8be4b6cf323

> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Vladimir
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Tue, 2014-12-16 at 00:27 +0100, Tomas Vondra wrote:
>> plperl.c: In function 'array_to_datum_internal':
>> plperl.c:1196: error: too few arguments to function 'accumArrayResult'
>> plperl.c: In function 'plperl_array_to_datum':
>> plperl.c:1223: error: too few arguments to function 'initArrayResult'
>>
>> Cheers,
>
> Thanks, attached is a version that fixes this.

Just jumping into this patch now. Do we think this is worth changing the signature of functions in array.h, which might be used from a lot of third-party code? We might want to provide new functions to avoid a breaking change.

Regards,
	Jeff Davis
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Sun, 2014-12-21 at 13:00 -0500, Tom Lane wrote:
> Tomas Vondra writes:
>> i.e. either destroy the whole context if possible, and just free the
>> memory when using a shared memory context. But I'm afraid this would
>> penalize the shared memory context, because that's intended for cases
>> where all the build states coexist in parallel and then at some point
>> are all converted into a result and thrown away. Adding pfree() calls is
>> no improvement here, and just wastes cycles.
>
> FWIW, I quite dislike the terminology "shared memory context", because
> it sounds too much like it means "a context in shared memory". I see
> that the patch itself doesn't use that phrase, which is good, but can
> we come up with some other phrase for talking about it?

"Common memory context"?

Regards,
	Jeff Davis
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 29 December 2014 10:22, Amit Kapila wrote:
>> Case 1: In the case of the complete DB:
>>
>> In the base code it will first process all the tables in stage 1, then in
>> stage 2, and so on, so that at some point all the tables are analyzed at
>> least up to a certain stage.
>>
>> But if we process all the stages for one table first, and then take the
>> other table for processing at stage 1, then it may happen that for some
>> table all the stages are processed, but others are waiting for even the
>> first stage to be processed; this will affect the functionality of
>> analyze-in-stages.
>>
>> Case 2: In the case of independent tables like -t "t1" -t "t2":
>>
>> In the base code also we currently process all the stages for the first
>> table, then do the same for the next table, and so on.
>>
>> I think, if the user is giving multiple tables together, then his purpose
>> might be to analyze those tables together stage by stage, but in our code
>> we analyze table1 in all stages and only then consider the next table.
>
> So basically you want to say that currently the processing for
> tables with the --analyze-in-stages switch is different when the user
> executes vacuumdb for the whole database versus when it does so for
> individual tables (multiple tables together). In the proposed patch
> the processing for tables will be the same in either case (whole
> database or independent tables). I think your point has merit, so
> let's proceed with this as it is in your patch.
>
> Do you have anything more to handle in the patch, or shall I take
> another look and pass it to the committer if it is ready?

I think there is nothing more to be handled from my side; you can go ahead with the review.

Regards,
Dilip
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On Wed, Dec 24, 2014 at 4:00 PM, Dilip kumar wrote:
>
> Case 1: In the case of the complete DB:
>
> In the base code it will first process all the tables in stage 1, then in
> stage 2, and so on, so that at some point all the tables are analyzed at
> least up to a certain stage.
>
> But if we process all the stages for one table first, and then take the
> other table for processing at stage 1, then it may happen that for some
> table all the stages are processed, but others are waiting for even the
> first stage to be processed; this will affect the functionality of
> analyze-in-stages.
>
> Case 2: In the case of independent tables like -t "t1" -t "t2":
>
> In the base code also we currently process all the stages for the first
> table, then do the same for the next table, and so on.
>
> I think, if the user is giving multiple tables together, then his purpose
> might be to analyze those tables together stage by stage, but in our code
> we analyze table1 in all stages and only then consider the next table.

So basically you want to say that currently the processing for tables with the --analyze-in-stages switch is different when the user executes vacuumdb for the whole database versus when it does so for individual tables (multiple tables together). In the proposed patch the processing for tables will be the same in either case (whole database or independent tables). I think your point has merit, so let's proceed with this as it is in your patch.

Do you have anything more to handle in the patch, or shall I take another look and pass it to the committer if it is ready?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
[HACKERS] psql tab completion: fix COMMENT ON ... IS IS IS
Hi

Currently tab completion for 'COMMENT ON {object} foo IS' will result in the 'IS' being duplicated up to two times; not a world-shattering issue I know, but the fix is trivial and I stumble over it often enough for it to mildly annoy me. Patch attached.

Regards

Ian Barwick

--
Ian Barwick                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index 82c926d..7212015
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** psql_completion(const char *text, int st
*** 2130,2141 ****
  	{
  		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
  	}
! 	else if ((pg_strcasecmp(prev4_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev3_wd, "ON") == 0) ||
! 			 (pg_strcasecmp(prev5_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev4_wd, "ON") == 0) ||
! 			 (pg_strcasecmp(prev6_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev5_wd, "ON") == 0))
  		COMPLETE_WITH_CONST("IS");
  	/* COPY */
--- 2130,2142 ----
  	{
  		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
  	}
! 	else if (((pg_strcasecmp(prev4_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev3_wd, "ON") == 0) ||
! 			  (pg_strcasecmp(prev5_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev4_wd, "ON") == 0) ||
! 			  (pg_strcasecmp(prev6_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev5_wd, "ON") == 0)) &&
! 			 pg_strcasecmp(prev_wd, "IS") != 0)
  		COMPLETE_WITH_CONST("IS");
  	/* COPY */
[HACKERS] Coverity and pgbench
Hi, Anybody looks into problems in pgbench pointed out by Coverity? If no, I would like to work on fixing them because I need to write patches for "-f option" related issues anyway. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] orangutan seizes up during isolation-check
On 12/28/2014 04:58 PM, Noah Misch wrote:
> On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote:
>> On 10/11/14 1:41 AM, Noah Misch wrote:
>>> Good question. It would be nice to make the change there, for the benefit
>>> of other consumers. The patch's setlocale_native_forked() assumes it never
>>> runs in a multithreaded process, but libintl_setlocale() must not assume
>>> that. I see a few ways libintl/gnulib might proceed:
>>
>> Yeah, it's difficult to see how they might proceed if they keep calling
>> into Core Foundation, which might do anything, now or in the future.
>>
>> I went ahead and submitted a bug report to gettext:
>> https://savannah.gnu.org/bugs/index.php?43404
>>
>> (The way I understand it is that the files concerned originate in
>> gettext and are copied to gnulib.)
>>
>> Let's see what they say.
>
> The gettext maintainer was open to implementing the
> setlocale_native_forked() technique in gettext, though the last visible
> progress was in October. In any event, PostgreSQL builds will see older
> gettext for several years. If setlocale-darwin-fork-v1.patch is not wanted,
> I suggest making the postmaster check during startup whether it has become
> multithreaded. If multithreaded:
>
>     FATAL: postmaster became multithreaded during startup
>     HINT: Set the LC_ALL environment variable to a valid locale.
>
> I wondered whether to downgrade FATAL to LOG in back branches. Introducing
> a new reason to block startup is disruptive for a minor release, but having
> the postmaster deadlock at an unpredictable later time is even more
> disruptive. I am inclined to halt startup that way in all branches.

Yeah. It should be easily fixable, AIUI, and startup is surely a good and obvious time to do that.

>> I like the idea of calling pthread_is_threaded_np() as a verification.
>> This appears to be an OS X-specific function at the moment. If other
>> platforms start adding it, then we'll run into the usual problems of how
>> to link binaries that use pthread functions. Maybe that's not a
>> realistic concern.
>
> True. As written, "configure" will report the function unavailable if it
> requires threading libraries. For a measure that's just a backstop against
> other bugs, that may be just right.
>
> I would like to go ahead and commit setlocale-main-harden-v1.patch, which
> is a good thing to have regardless of what happens with gettext.

I'm OK with this, but on its own it won't fix orangutan's problems, will it?

cheers

andrew
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Sat, Dec 27, 2014 at 8:51 PM, Tom Lane wrote: > Robert Haas writes: >> On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane wrote: >>> This would have the effect of transferring all responsibility for >>> dead-stats-entry cleanup to autovacuum. For ordinary users, I think >>> that'd be just fine. It might be less fine though for people who >>> disable autovacuum, if there still are any. > >> -1. I don't think it's a good idea to inflict pain on people who want >> to schedule their vacuums manually (and yes, there are some) to get >> clean buildfarm runs. > > Did you read the rest of it? Yeah... since when do I not read your emails? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] orangutan seizes up during isolation-check
On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote: > On 10/11/14 1:41 AM, Noah Misch wrote: > > Good question. It would be nice to make the change there, for the benefit > > of > > other consumers. The patch's setlocale_native_forked() assumes it never > > runs > > in a multithreaded process, but libintl_setlocale() must not assume that. I > > see a few ways libintl/gnulib might proceed: > > Yeah, it's difficult to see how they might proceed if they keep calling > into Core Foundation, which might do anything, now or in the future. > > I went ahead and submitted a bug report to gettext: > https://savannah.gnu.org/bugs/index.php?43404 > > (They way I understand it is that the files concerned originate in > gettext and are copied to gnulib.) > > Let's see what they say. The gettext maintainer was open to implementing the setlocale_native_forked() technique in gettext, though the last visible progress was in October. In any event, PostgreSQL builds will see older gettext for several years. If setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster check during startup whether it has become multithreaded. If multithreaded: FATAL: postmaster became multithreaded during startup HINT: Set the LC_ALL environment variable to a valid locale. I wondered whether to downgrade FATAL to LOG in back branches. Introducing a new reason to block startup is disruptive for a minor release, but having the postmaster deadlock at an unpredictable later time is even more disruptive. I am inclined to halt startup that way in all branches. > I like the idea of calling pthread_is_threaded_np() as a verification. > This appears to be a OS X-specific function at the moment. If other > platforms start adding it, then we'll run into the usual problems of how > to link binaries that use pthread functions. Maybe that's not a > realistic concern. True. As written, "configure" will report the function unavailable if it requires threading libraries. 
For a measure that's just a backstop against other bugs, that may be just right. I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a good thing to have regardless of what happens with gettext. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Sun, 2014-12-28 at 12:37 -0800, Jeff Davis wrote:
> I feel like I made a mistake -- can someone please do a
> sanity check on my numbers?

I forgot to randomize the inputs, which doesn't matter much for hashagg but does matter for sort. New data script attached. The results are even *better* for disk-based hashagg than the previous numbers suggest. Here are some new numbers:

work_mem='1MB':
              sort+group (s)  hashagg (s)
singleton q1       21             10
singleton q2       12              8
even q1            20              7
even q2            13              5
skew q1            22              6
skew q2            16              4

work_mem='4MB':
              sort+group (s)  hashagg (s)
singleton q1       17             10
singleton q2       11              6
even q1            16              7
even q2            11              5
skew q1            19              6
skew q2            13              4

work_mem='16MB':
              sort+group (s)  hashagg (s)
singleton q1       16             11
singleton q2       11              7
even q1            15              8
even q2            12              6
skew q1            15              6
skew q2            12              4

work_mem='64MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       13              8
even q1            17             10
even q2            13              6
skew q1            17              6
skew q2            14              4

work_mem='256MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       14              7
even q1            16              9
even q2            14              5
skew q1            18              6
skew q2            13              4

work_mem='512MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       14              7
even q1            17              9
even q2            14              5
skew q1            17              6
skew q2            13              4

work_mem='2GB':
              sort+group (s)  hashagg (s)
singleton q1       11             11
singleton q2        7              6
even q1            10              9
even q2             7              5
skew q1             7              6
skew q2             4              4

Regards,
	Jeff Davis

hashagg_test_data.sql
Description: application/sql
Re: [HACKERS] 9.5: Memory-bounded HashAgg
On Thu, 2014-12-11 at 02:46 -0800, Jeff Davis wrote: > On Sun, 2014-08-10 at 14:26 -0700, Jeff Davis wrote: > > This patch is requires the Memory Accounting patch, or something similar > > to track memory usage. > > > > The attached patch enables hashagg to spill to disk, which means that > > hashagg will contain itself to work_mem even if the planner makes a > > bad misestimate of the cardinality. > > New patch attached. All open items are complete, though the patch may > have a few rough edges. > This thread got moved over here: http://www.postgresql.org/message-id/1419326161.24895.13.camel@jeff-desktop Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Sun, Dec 28, 2014 at 12:37 PM, Jeff Davis wrote: > Do others have similar numbers? I'm quite surprised at how little > work_mem seems to matter for these plans (HashJoin might be a different > story though). I feel like I made a mistake -- can someone please do a > sanity check on my numbers? I have seen external sorts that were quicker than internal sorts before. With my abbreviated key patch, under certain circumstances external sorts are faster, while presumably the same thing is true of int4 attribute sorts today. Actually, I saw a 10MB work_mem setting that was marginally faster than a multi-gigabyte one that fit the entire sort in memory. It probably has something to do with caching effects dominating over the expense of more comparisons, since higher work_mem settings that still resulted in an external sort were slower than the 10MB setting. I was surprised by this too, but it has been independently reported by Jeff Janes. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Tue, 2014-12-23 at 01:16 -0800, Jeff Davis wrote:
> New patch attached (rebased, as well).
>
> I also see your other message about adding regression testing. I'm
> hesitant to slow down the tests for everyone to run through this code
> path though. Should I add regression tests, and then remove them later
> after we're more comfortable that it works?

Attached are some tests I ran. First, generate the data sets with hashagg_test_data.sql. Then, do (I used work_mem at the default of 4MB):

set enable_hashagg=false;
\o /tmp/sort.out
\i /tmp/hashagg_test.sql
\o
set enable_hashagg=true;
\o /tmp/hash.out
\i /tmp/hashagg_test.sql

and then diff'd the output to make sure the results are the same (except the plans, of course). The script loads the results into a temp table, then sorts it before outputting, to make the test order-independent. I didn't just add an ORDER BY, because that would change the plan and it would never use hashagg.

I think that has fairly good coverage of the hashagg code. I used 3 different input data sets, byval and byref types (for the group key and args), and a group aggregate query as well as DISTINCT. Let me know if I missed something.

I also did some performance comparisons between disk-based sort+group and disk-based hashagg. The results are quite favorable for hashagg given the data sets I provided. Simply create the data using hashagg_test_data.sql (if not already done), set work_mem to the value you want to test, and run hashagg_test_perf.sql. It uses EXPLAIN ANALYZE for the timings.

singleton: 10M groups of 1
even: 1M groups of 10
skew: wildly different group sizes; see data script
q1: group aggregate query
q2: distinct query

The total memory requirements for the test to run without going to disk range from about 100MB (for "even") to about 1GB (for "singleton"). Regardless of work_mem, these all fit in memory on my machine, so they aren't *really* going to disk.
Also note that, because of how the memory blocks are allocated, and because hashagg waits until memory is exceeded, hashagg might use about double work_mem when work_mem is small (the effect is not important at higher values).

work_mem='1MB':
              sort+group (s)  hashagg (s)
singleton q1       12             10
singleton q2        8              7
even q1            14              7
even q2            10              5
skew q1            22              6
skew q2            16              4

work_mem='4MB':
              sort+group (s)  hashagg (s)
singleton q1       12             11
singleton q2        8              6
even q1            12              7
even q2             9              5
skew q1            19              6
skew q2            13              3

work_mem='16MB':
              sort+group (s)  hashagg (s)
singleton q1       12             11
singleton q2        8              7
even q1            14              7
even q2            10              5
skew q1            15              6
skew q2            12              4

work_mem='64MB':
              sort+group (s)  hashagg (s)
singleton q1       13             12
singleton q2        9              8
even q1            14              8
even q2            10              5
skew q1            17              6
skew q2            13              4

work_mem='256MB':
              sort+group (s)  hashagg (s)
singleton q1       12             12
singleton q2        9              8
even q1            14              7
even q2            11              4
skew q1            16              6
skew q2            13              4

work_mem='512MB':
              sort+group (s)  hashagg (s)
singleton q1       12             12
singleton q2        9              7
even q1            14              7
even q2            10              4
skew q1            16              6
skew q2            12              4

work_mem='2GB':
              sort+group (s)  hashagg (s)
singleton q1        9             12
singleton q2        6              6
even q1             8              7
even q2             6              4
skew q1             7              6
skew q2             5              4

These numbers are great news for disk-based hashagg. It seems to be the same or better than sort+group in nearly all cases (again, this example doesn't actually go to disk, so those numbers may come out differently). Also, the numbers are remarkably stable for varying work_mem for both plans. That means that it doesn't cost much to keep a lower work_mem as long as your system has plenty of memory.

Do others have similar numbers? I'm quite surprised at how little work_mem seems to matter for these plans (HashJoin might be a different story though). I feel like I made a mistake -- can someone please do a sanity check on my numbers?
Re: [HACKERS] Proposal "VACUUM SCHEMA"
On 21.12.2014 at 18:48, Fabrízio de Royes Mello wrote:
> I work with some customers that have databases with a lot of schemas, and
> sometimes we need to run a manual VACUUM on one schema; it would be nice
> to have a new option to run vacuum on relations from a specific schema.
>
> The new syntax could be:
>
> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { [ table_name ] | SCHEMA schema_name }
>
> Also I'll add a new option to the "vacuumdb" client:
>
> -S, --schema=SCHEMA
>
> I can work on this feature for the 2015/02 CF.
>
> Thoughts?

This would be useful for ANALYZE, to make it easier to run analyze only for the interesting schemas after a pg_upgrade. I have a database with most of the actively used data in the "public" schema and a number of rarely accessed large logging and archive tables in other schemas. It'd be useful to prioritize analyzing the main tables before doing anything about the rarely used schemas, to allow the database to be put back into production as soon as possible.

/ Oskari
Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev wrote:
> Thanks for suggestions.
>
> Patch updated.

Cool, thanks. I just had an extra look at it.

+This is useful, if I using for restore of wal logs some
+external storage (like AWS S3) and no matter what the slave database
+will lag behind the master. The problem, what for each request to
+AWS S3 need to pay, what is why for N nodes, which try to get next
+wal log each 5 seconds will be bigger price, than for example each
+30 seconds.

I reworked this portion of the docs; it is rather incorrect, as the documentation should not use first-person subjects, and I don't believe that referencing any commercial products is a good thing in this context.

+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.

This is still referring to a timeout. That's not good. And the name of the parameter at the top of this comment block is missing.

+static int restore_command_retry_interval = 5000L;

I think that it would be more adapted to set that to 5000, and multiply by 1L. I am also wondering about having a better lower bound, like 100ms, to avoid some abuse of this feature in the retries.

+ ereport(ERROR,
+         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+          errmsg("\"%s\" must be bigger zero",
+                 "restore_command_retry_interval")));

I'd rather rewrite that to "must have a strictly positive value".

-     * Wait for more WAL to arrive. Time out after 5 seconds,
+     * Wait for more WAL to arrive. Time out after
+     * restore_command_retry_interval (5 seconds by default),
      * like when polling the archive, to react to a trigger
      * file promptly.
      */
      WaitLatch(&XLogCtl->recoveryWakeupLatch,
                WL_LATCH_SET | WL_TIMEOUT,
-               5000L);
+               restore_command_retry_interval);

I should have noticed earlier, but in its current state your patch actually does not work. What you are doing here is tuning the time the process waits for WAL from stream.
In your case what you want to control is the retry time for a restore_command in archive recovery, no? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] recovery_min_apply_delay with a negative value
Hi all,

While reviewing another patch, I have noticed that recovery_min_apply_delay can have a negative value. And the funny part is that we actually attempt to apply a delay even in this case, per this condition in recoveryApplyDelay@xlog.c:

    /* nothing to do if no delay configured */
    if (recovery_min_apply_delay == 0)
        return false;

Shouldn't we simply leave if recovery_min_apply_delay is lower than 0, and not only when it is equal to 0?

Regards,
--
Michael
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 10:18 PM, Ravi Kiran wrote: > Sir, I followed the instructions in the link which you gave , but this > time I am getting the following error. > > *Can't find a source file at > "/build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c" * > *Locate the file or edit the source lookup path to include its location.* > * > * > is the error something related my OS or is it related to postgres. I think it's best to discuss this somewhere other than pgsql-hackers, as this discussion isn't going to be of interest to the majority of people on this mailing list. What you're asking about now isn't really anything to do with PostgreSQL at all. I replied to your Stack Overflow post at http://stackoverflow.com/q/27676836/398670 . -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] attaching a process in eclipse
Sir, I followed the instructions in the link which you gave, but this time I am getting the following error:

    Can't find a source file at "/build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c"
    Locate the file or edit the source lookup path to include its location.

Is the error something related to my OS, or is it related to postgres? Thank you for telling me the mistake; I will follow the things which you told me from next time. Thank you

On Sun, Dec 28, 2014 at 7:36 PM, Craig Ringer wrote:
> On 12/28/2014 07:49 PM, Ravi Kiran wrote:
>> Thank you for the response sir, I am running both the eclipse and the
>> client under the same user name which is ravi, I have installed postgres
>> source code under the user ravi not postgres,
>
> It doesn't matter how you installed it. How you *run* it matters. If the
> postgresql processes run as user 'ravi' and so does Eclipse, then your
> problem is the one in the article I already sent you a link to, and you
> should follow those instructions.
>
>> should I change it and work under postgres
>
> No. If you read my explanation you would understand that this would just
> make it worse, because you can't attach to a process under a different
> user id.
>
>> and for that to happen should I uninstall the whole
>> postgres and re install under the new user postgres.
>
> That won't help.
>
> By the way, it looks like you also posted this to Stack Overflow:
>
> http://stackoverflow.com/q/27676836/398670
>
> It's OK to do that, but *please mention you posted somewhere else too*
> or preferably link to the relevant article in the mailing list archives.
>
> Also, when replying to mail on a list, please reply-to-all.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 07:49 PM, Ravi Kiran wrote:
> Thank you for the response sir, I am running both the eclipse and the
> client under the same user name which is ravi, I have installed postgres
> source code under the user ravi not postgres,

It doesn't matter how you installed it. How you *run* it matters. If the PostgreSQL processes run as user 'ravi' and so does Eclipse, then your problem is the one in the article I already sent you a link to, and you should follow those instructions.

> should I change it and
> work under postgres

No. If you read my explanation you would understand that this would just make it worse, because you can't attach to a process under a different user id.

> and for that to happen should I uninstall the whole
> postgres and re install under the new user postgres.

That won't help.

By the way, it looks like you also posted this to Stack Overflow:

http://stackoverflow.com/q/27676836/398670

It's OK to do that, but *please mention you posted somewhere else too* or preferably link to the relevant article in the mailing list archives.

Also, when replying to mail on a list, please reply-to-all.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Dec 26, 2014 at 4:16 PM, Michael Paquier wrote:
> On Fri, Dec 26, 2014 at 3:24 PM, Fujii Masao wrote:
>> pglz_compress() and pglz_decompress() still use PGLZ_Header, so the frontend
>> which uses those functions needs to handle PGLZ_Header. But it basically should
>> be handled via the varlena macros. That is, the frontend still seems to need to
>> understand the varlena datatype. I think we should avoid that. Thought?
> Hm, yes it may be wiser to remove it and make the data passed to pglz
> for varlena 8 bytes shorter..

OK, here is the result of this work, made of 3 patches. The first two patches move the pglz stuff to src/common and make it a frontend utility entirely independent of varlena and its related metadata.

- Patch 1 is a simple move of pglz to src/common, with PGLZ_Header still present. There is nothing amazing here; this is the broken version that was reverted in 966115c.
- The real stuff comes with patch 2, which removes PGLZ_Header, changing the compression and decompression APIs of pglz so that they no longer carry toast metadata, that metadata now being localized in tuptoaster.c. Note that this patch preserves the on-disk format (tested with pg_upgrade from 9.4 to a patched HEAD server). Here is what the compression and decompression APIs look like with this patch, simply performing operations from a source to a destination:

extern int32 pglz_compress(const char *source, int32 slen, char *dest,
                           const PGLZ_Strategy *strategy);
extern int32 pglz_decompress(const char *source, char *dest,
                             int32 compressed_size, int32 raw_size);

The return value of those functions is the number of bytes written to the destination buffer, and 0 if the operation failed. This also aims to make the backend more pluggable. The reason patch 2 exists separately (it could be merged with patch 1) is to facilitate review of the changes made to pglz to make it an entirely independent facility.
Patch 3 is the FPW compression itself, adjusted to fit with those changes. Note that since PGLZ_Header, which contained the raw size of the compressed data, no longer exists, it is necessary to store the raw length of the block image directly in the block image header, using 2 additional bytes. Those 2 bytes are used only if wal_compression is set to true, thanks to a boolean flag, so if wal_compression is disabled the WAL record length is exactly the same as on HEAD, and there is no penalty in the default case. As in the previous patches, the block image is compressed without its hole.

To finish, here are some results using the same test as before, with the hack on getrusage to get the system and user CPU diff on a single backend execution:
http://www.postgresql.org/message-id/cab7npqsc97o-ue5paxfmukwcxe_jioyxo1m4a0pmnmyqane...@mail.gmail.com

Just as a reminder, this test generated a fixed number of FPWs on a single backend with fsync and autovacuum disabled, with several values of fillfactor to see the effect of page holes.

  test   | ffactor | user_diff | system_diff | pg_size_pretty
---------+---------+-----------+-------------+----------------
 FPW on  |      50 | 48.823907 |    0.737649 | 582 MB
 FPW on  |      20 | 16.135000 |    0.764682 | 229 MB
 FPW on  |      10 |  8.521099 |    0.751947 | 116 MB
 FPW off |      50 | 29.722793 |    1.045577 | 746 MB
 FPW off |      20 | 12.673375 |    0.905422 | 293 MB
 FPW off |      10 |  6.723120 |    0.779936 | 148 MB
 HEAD    |      50 | 30.763136 |    1.129822 | 746 MB
 HEAD    |      20 | 13.340823 |    0.893365 | 293 MB
 HEAD    |      10 |  7.267311 |    0.909057 | 148 MB
(9 rows)

Results are similar to what was measured previously; it doesn't hurt to check again, but roughly the CPU cost is balanced by the WAL record reduction. There is no difference in terms of WAL record length between HEAD and this patch when wal_compression = off.

Patches, as well as the test script and the results, are attached.
Regards,
--
Michael

Attachments: results.sql, test_compress, 20141228_fpw_compression_v12.tar.gz
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 06:37 PM, Ravi Kiran wrote:
>
> "Could not attach to process. If your uid matches the uid of the target
> process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
> again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"
>
> how do we rectify this error?

Oh, and if you *are* debugging a PostgreSQL process under the same uid as your own, adjust the ptrace_scope setting in the yama security module, like the error message above tells you to. Details here:

http://askubuntu.com/questions/41629/after-upgrade-gdb-wont-attach-to-process

which I found by *searching for the error message text*.

BTW, in general I recommend that when you're modifying and debugging PostgreSQL you run it under your normal user account using 'postgres' or 'pg_ctl', rather than messing with an operating system installation. It's much easier.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 06:37 PM, Ravi Kiran wrote:
>
> "Could not attach to process. If your uid matches the uid of the target
> process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
> again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"
>
> how do we rectify this error?

At a guess, the PostgreSQL you are trying to attach to is not running under the same user account as Eclipse. By default the Linux kernel only permits you to attach to a process running under the same user account as you, unless you are the root user. That's what "if your uid matches the uid of the target process" refers to.

You'll have the same issue when attaching using gdb directly; this isn't an Eclipse issue. If you were debugging manually with gdb you'd often do something like

    sudo -u postgres gdb -p thepid

to run gdb as the same user id as the target process. (Avoid running gdb as root whenever possible.) Perhaps Eclipse lets you configure a prefix command for gdb so you can do that with Eclipse and passwordless sudo?

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] attaching a process in eclipse
hi,

I am working with the PostgreSQL 9.4.0 source using Eclipse (Indigo) on Ubuntu 14.04. I am facing a problem attaching a client process to the PostgreSQL server. I am following the steps given in this link:

https://wiki.postgresql.org/wiki/Working_with_Eclipse#Debugging_PostgreSQL_from_Eclipse

In the debug configuration, I have given the following information in the fields:

C/C++ Application: src/backend/postgres
Project: postgresql-9.4.0

I found out the process id of the client process using "select pg_backend_pid()" and used it to attach to the server. But I am getting the following error whenever I do this:

"Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"

how do we rectify this error?

Thank you

Regards
K.Ravikiran
Re: [HACKERS] nls and server log
On 12/25/2014 02:35 AM, Euler Taveira wrote:
> Hi,
>
> Currently the same message goes to server log and client app. Sometimes
> it bothers me since I have to analyze server logs and discovered that
> lc_messages is set to pt_BR and to worse things that stup^H^H^H
> application parse some error messages in portuguese.

IMO logging is simply broken for platforms where the postmaster and all DBs don't share an encoding. We mix different encodings in log messages and provide no way to separate them out. Nor is there a way to log different messages to different files.

It's not just an issue with translations. We mix and mangle encodings of user-supplied text, like RAISE strings in procs, for example.

We really need to be treating encoding for logging and for the client much more separately than we currently do. I think any consideration of translations for logging should be done with the underlying encoding issues in mind.

My personal opinion is that we should require the server log to be capable of representing all chars in the encodings used by any DB, which in practice means we always just log in UTF-8 if the user wants to permit DBs with different encodings. An alternative would be one file per database, always in the encoding of that database.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services