Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Tue, 2014-04-01 at 13:08 -0400, Tom Lane wrote:
> I think a patch that stood a chance of getting committed would need to
> detect whether the aggregate was being called in simple or grouped
> contexts, and apply different behaviors in the two cases.

The simple context doesn't seem like a big problem even if we change things as Tomas suggests: "IMNSHO these are the issues we really should fix - by lowering the initial element count (64->4) and using a single memory context."

In the simple context, there's only one context regardless, so the only cost I see is from reducing the initial allocation from 64 to some lower number. But if we're doubling each time, it won't take long to get there; and because it's the simple context, we only need to do it once.

Regards,
	Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] ExclusiveLock on extension of relation with huge shared_buffers
On 25 Oct 2014, at 4:31, Jim Nasby wrote:
> Please don't top-post.
>
> On 10/24/14, 3:40 AM, Borodin Vladimir wrote:
>> I have taken some backtraces (they are attached to the letter) of two
>> processes with such a command:
>>
>> pid=17981; while true; do date; gdb -batch -e back /usr/pgsql-9.4/bin/postgres $pid; echo; echo; echo; echo; sleep 0.1; done
>>
>> Process 17981 was holding the lock for a long time - http://pastie.org/9671931.
>> And process 13886 was waiting for the lock (at a different time and from a
>> different blocker, actually, but I don't think that is really important) -
>> http://pastie.org/9671939.
>>
>> As I can see, 17981 is actually waiting for the BufFreelistLock LWLock in the
>> StrategyGetBuffer function, freelist.c:134, while holding an exclusive lock on
>> the relation. I will try to increase NUM_BUFFER_PARTITIONS (on a read-only
>> load it also gave us some performance boost) and write the result in this
>> thread.
>
> BufFreelistLock becomes very contended when shared buffers are under a lot of
> pressure.
>
> Here's what I believe is happening:
>
> If RelationGetBufferForTuple() decides it needs to extend, this happens:
>
>     LockRelationForExtension(relation, ExclusiveLock);
>     buffer = ReadBufferBI(relation, P_NEW, bistate);
>
> Assuming bistate is false (I didn't check the bulk case), ReadBufferBI() ends
> up at ReadBuffer_common(), which calls BufferAlloc(). In the normal case,
> BufferAlloc() won't find the necessary buffer, so it will call
> StrategyGetBuffer(), which will end up taking the freelist lock. Currently
> the free list is normally empty, which means we now need to run the clock
> sweep to find a victim buffer. The clock sweep will keep running until it
> finds a buffer that is not pinned and has usage_count = 0.
> If shared buffers are under heavy pressure, you can have a huge number of them
> with usage_count = 5; with 100GB of shared buffers and an 8K BLCKSZ, you could
> have to check buffers *52 million* times (assuming you finally find a buffer
> at the start of the 5th loop) before you find a victim.
>
> Keep in mind that's all happening while you're holding both the extension
> lock *and the freelist lock*, which basically means no one else in the entire
> system can allocate a new buffer.

I'll try the same workload with the recent patch from Andres Freund [0].

> This is one reason why a large shared_buffers setting is usually
> counter-productive. Experience with older versions is that setting it higher
> than about 8GB is more likely to hurt than to help. Newer versions are
> probably better, but I think you'll be hard-pressed to find a workload where
> 100GB makes sense. It might if your entire database fits in shared_buffers
> (though, even then there are probably a number of O(n) or worse operations
> that will hurt you), but if your database is > shared_buffers you're probably
> in trouble.
>
> I suggest cutting shared_buffers *way* down. Old-school advice for this
> machine would be 8G (since 25% of 128G would be too big). You might be able
> to do better than 8G, but I recommend not even trying unless you've got a
> good way to test your performance.
>
> If you can test performance and find an optimal setting for shared_buffers,
> please do share your test data and findings. :)

Of course, it works well with shared_buffers <= 8GB. But we have seen that on a read-only load, when the data set fits in RAM with <= 8GB of shared_buffers, we hit the BufFreelistLock LWLock while moving pages between shared buffers and the page cache. Increasing the shared_buffers size to the size of the data set improves performance up to 2.5x on this read-only load. So we started testing a configuration with huge shared_buffers under a writing load, and that's why I started this thread.
Since StrategyGetBuffer() no longer uses the BufFreelistLock LWLock [1], I'll also re-run the tests with a read-only load and small shared_buffers.

[0] http://git.postgresql.org/pg/commitdiff/d72731a70450b5e7084991b9caa15cb58a2820df
[1] http://git.postgresql.org/pg/commitdiff/1dcfb8da09c47d2a7502d1dfab06c8be4b6cf323

> --
> Jim Nasby, Data Architect, Blue Treble Consulting
> Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Vladimir
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Tue, 2014-12-16 at 00:27 +0100, Tomas Vondra wrote:
>> plperl.c: In function 'array_to_datum_internal':
>> plperl.c:1196: error: too few arguments to function 'accumArrayResult'
>> plperl.c: In function 'plperl_array_to_datum':
>> plperl.c:1223: error: too few arguments to function 'initArrayResult'
>>
>> Cheers,
>
> Thanks, attached is a version that fixes this.

Just jumping into this patch now. Do we think this is worth changing the signature of functions in array.h, which might be used from a lot of third-party code? We might want to provide new functions to avoid a breaking change.

Regards,
	Jeff Davis
Re: [HACKERS] PATCH: decreasing memory needlessly consumed by array_agg
On Sun, 2014-12-21 at 13:00 -0500, Tom Lane wrote:
> Tomas Vondra writes:
>> i.e. either destroy the whole context if possible, and just free the
>> memory when using a shared memory context. But I'm afraid this would
>> penalize the shared memory context, because that's intended for cases
>> where all the build states coexist in parallel and then at some point
>> are all converted into a result and thrown away. Adding pfree() calls is
>> no improvement here, and just wastes cycles.
>
> FWIW, I quite dislike the terminology "shared memory context", because
> it sounds too much like it means "a context in shared memory". I see
> that the patch itself doesn't use that phrase, which is good, but can
> we come up with some other phrase for talking about it?

"Common memory context"?

Regards,
	Jeff Davis
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On 29 December 2014 10:22, Amit Kapila wrote:
>> Case 1: In the case of the complete DB:
>>
>> In the base code it will first process all the tables in stage 1, then in
>> stage 2, and so on, so that at some point all the tables are analyzed at
>> least up to a certain stage.
>>
>> But if we process all the stages for one table first, and then take the
>> other table for processing at stage 1, then it may happen that for some
>> table all the stages are processed, but others are waiting for even the
>> first stage to be processed; this will affect the functionality of
>> analyze-in-stages.
>>
>> Case 2: In the case of independent tables like -t "t1" -t "t2":
>>
>> In the base code also we currently process all the stages for the first
>> table, then do the same for the next table, and so on.
>>
>> I think, if the user is giving multiple tables together, then his purpose
>> might be to analyze those tables together stage by stage, but in our code
>> we analyze table1 in all stages and only then consider the next table.
>
> So basically you want to say that currently the processing for
> tables with the --analyze-in-stages switch is different when the user
> executes vacuumdb for the whole database versus when it does so for
> individual tables (multiple tables together). In the proposed patch
> the processing for tables will be the same in either case (whole
> database or independent tables). I think your point has merit, so
> let's proceed with this as it is in your patch.
>
> Do you have anything more to handle in the patch, or shall I take
> another look and pass it to the committer if it is ready?

I think there is nothing more to be handled from my side; you can go ahead with the review.

Regards,
Dilip
Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]
On Wed, Dec 24, 2014 at 4:00 PM, Dilip kumar wrote:
>
> Case 1: In the case of the complete DB:
>
> In the base code it will first process all the tables in stage 1, then in
> stage 2, and so on, so that at some point all the tables are analyzed at
> least up to a certain stage.
>
> But if we process all the stages for one table first, and then take the
> other table for processing at stage 1, then it may happen that for some
> table all the stages are processed, but others are waiting for even the
> first stage to be processed; this will affect the functionality of
> analyze-in-stages.
>
> Case 2: In the case of independent tables like -t "t1" -t "t2":
>
> In the base code also we currently process all the stages for the first
> table, then do the same for the next table, and so on.
>
> I think, if the user is giving multiple tables together, then his purpose
> might be to analyze those tables together stage by stage, but in our code
> we analyze table1 in all stages and only then consider the next table.

So basically you want to say that currently the processing for tables with the --analyze-in-stages switch is different when the user executes vacuumdb for the whole database versus when it does so for individual tables (multiple tables together). In the proposed patch the processing for tables will be the same in either case (whole database or independent tables). I think your point has merit, so let's proceed with this as it is in your patch.

Do you have anything more to handle in the patch, or shall I take another look and pass it to the committer if it is ready?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
[HACKERS] psql tab completion: fix COMMENT ON ... IS IS IS
Hi

Currently tab completion for 'COMMENT ON {object} foo IS' will result in the 'IS' being duplicated up to two times; not a world-shattering issue I know, but the fix is trivial and I stumble over it often enough for it to mildly annoy me. Patch attached.

Regards

Ian Barwick

--
Ian Barwick                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
new file mode 100644
index 82c926d..7212015
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
*** psql_completion(const char *text, int st
*** 2130,2141 ****
  	{
  		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
  	}
! 	else if ((pg_strcasecmp(prev4_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev3_wd, "ON") == 0) ||
! 			 (pg_strcasecmp(prev5_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev4_wd, "ON") == 0) ||
! 			 (pg_strcasecmp(prev6_wd, "COMMENT") == 0 &&
! 			  pg_strcasecmp(prev5_wd, "ON") == 0))
  		COMPLETE_WITH_CONST("IS");
  	/* COPY */
--- 2130,2142 ----
  	{
  		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
  	}
! 	else if (((pg_strcasecmp(prev4_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev3_wd, "ON") == 0) ||
! 			  (pg_strcasecmp(prev5_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev4_wd, "ON") == 0) ||
! 			  (pg_strcasecmp(prev6_wd, "COMMENT") == 0 &&
! 			   pg_strcasecmp(prev5_wd, "ON") == 0)) &&
! 			 pg_strcasecmp(prev_wd, "IS") != 0)
  		COMPLETE_WITH_CONST("IS");
  	/* COPY */
[HACKERS] Coverity and pgbench
Hi, Anybody looks into problems in pgbench pointed out by Coverity? If no, I would like to work on fixing them because I need to write patches for "-f option" related issues anyway. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] orangutan seizes up during isolation-check
On 12/28/2014 04:58 PM, Noah Misch wrote:
> On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote:
>> On 10/11/14 1:41 AM, Noah Misch wrote:
>>> Good question. It would be nice to make the change there, for the benefit
>>> of other consumers. The patch's setlocale_native_forked() assumes it never
>>> runs in a multithreaded process, but libintl_setlocale() must not assume
>>> that. I see a few ways libintl/gnulib might proceed:
>>
>> Yeah, it's difficult to see how they might proceed if they keep calling
>> into Core Foundation, which might do anything, now or in the future.
>>
>> I went ahead and submitted a bug report to gettext:
>> https://savannah.gnu.org/bugs/index.php?43404
>>
>> (The way I understand it is that the files concerned originate in
>> gettext and are copied to gnulib.)
>>
>> Let's see what they say.
>
> The gettext maintainer was open to implementing the
> setlocale_native_forked() technique in gettext, though the last visible
> progress was in October. In any event, PostgreSQL builds will see older
> gettext for several years. If setlocale-darwin-fork-v1.patch is not wanted,
> I suggest making the postmaster check during startup whether it has become
> multithreaded. If multithreaded:
>
>     FATAL: postmaster became multithreaded during startup
>     HINT: Set the LC_ALL environment variable to a valid locale.
>
> I wondered whether to downgrade FATAL to LOG in back branches. Introducing
> a new reason to block startup is disruptive for a minor release, but having
> the postmaster deadlock at an unpredictable later time is even more
> disruptive. I am inclined to halt startup that way in all branches.

Yeah. It should be easily fixable, AIUI, and startup is surely a good and obvious time to do that.

>> I like the idea of calling pthread_is_threaded_np() as a verification.
>> This appears to be an OS X-specific function at the moment. If other
>> platforms start adding it, then we'll run into the usual problems of how
>> to link binaries that use pthread functions. Maybe that's not a
>> realistic concern.
>
> True. As written, "configure" will report the function unavailable if it
> requires threading libraries. For a measure that's just a backstop against
> other bugs, that may be just right.
>
> I would like to go ahead and commit setlocale-main-harden-v1.patch, which
> is a good thing to have regardless of what happens with gettext.

I'm OK with this, but on its own it won't fix orangutan's problems, will it?

cheers

andrew
Re: [HACKERS] Better way of dealing with pgstat wait timeout during buildfarm runs?
On Sat, Dec 27, 2014 at 8:51 PM, Tom Lane wrote: > Robert Haas writes: >> On Sat, Dec 27, 2014 at 7:55 PM, Tom Lane wrote: >>> This would have the effect of transferring all responsibility for >>> dead-stats-entry cleanup to autovacuum. For ordinary users, I think >>> that'd be just fine. It might be less fine though for people who >>> disable autovacuum, if there still are any. > >> -1. I don't think it's a good idea to inflict pain on people who want >> to schedule their vacuums manually (and yes, there are some) to get >> clean buildfarm runs. > > Did you read the rest of it? Yeah... since when do I not read your emails? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] orangutan seizes up during isolation-check
On Sat, Oct 11, 2014 at 09:07:46AM -0400, Peter Eisentraut wrote: > On 10/11/14 1:41 AM, Noah Misch wrote: > > Good question. It would be nice to make the change there, for the benefit > > of > > other consumers. The patch's setlocale_native_forked() assumes it never > > runs > > in a multithreaded process, but libintl_setlocale() must not assume that. I > > see a few ways libintl/gnulib might proceed: > > Yeah, it's difficult to see how they might proceed if they keep calling > into Core Foundation, which might do anything, now or in the future. > > I went ahead and submitted a bug report to gettext: > https://savannah.gnu.org/bugs/index.php?43404 > > (They way I understand it is that the files concerned originate in > gettext and are copied to gnulib.) > > Let's see what they say. The gettext maintainer was open to implementing the setlocale_native_forked() technique in gettext, though the last visible progress was in October. In any event, PostgreSQL builds will see older gettext for several years. If setlocale-darwin-fork-v1.patch is not wanted, I suggest making the postmaster check during startup whether it has become multithreaded. If multithreaded: FATAL: postmaster became multithreaded during startup HINT: Set the LC_ALL environment variable to a valid locale. I wondered whether to downgrade FATAL to LOG in back branches. Introducing a new reason to block startup is disruptive for a minor release, but having the postmaster deadlock at an unpredictable later time is even more disruptive. I am inclined to halt startup that way in all branches. > I like the idea of calling pthread_is_threaded_np() as a verification. > This appears to be a OS X-specific function at the moment. If other > platforms start adding it, then we'll run into the usual problems of how > to link binaries that use pthread functions. Maybe that's not a > realistic concern. True. As written, "configure" will report the function unavailable if it requires threading libraries. 
For a measure that's just a backstop against other bugs, that may be just right. I would like to go ahead and commit setlocale-main-harden-v1.patch, which is a good thing to have regardless of what happens with gettext. Thanks, nm -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Sun, 2014-12-28 at 12:37 -0800, Jeff Davis wrote:
> I feel like I made a mistake -- can someone please do a
> sanity check on my numbers?

I forgot to randomize the inputs, which doesn't matter much for hashagg but does matter for sort. New data script attached. The results are even *better* for disk-based hashagg than the previous numbers suggest. Here are some new numbers:

work_mem='1MB':
              sort+group (s)  hashagg (s)
singleton q1       21             10
singleton q2       12              8
even q1            20              7
even q2            13              5
skew q1            22              6
skew q2            16              4

work_mem='4MB':
              sort+group (s)  hashagg (s)
singleton q1       17             10
singleton q2       11              6
even q1            16              7
even q2            11              5
skew q1            19              6
skew q2            13              4

work_mem='16MB':
              sort+group (s)  hashagg (s)
singleton q1       16             11
singleton q2       11              7
even q1            15              8
even q2            12              6
skew q1            15              6
skew q2            12              4

work_mem='64MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       13              8
even q1            17             10
even q2            13              6
skew q1            17              6
skew q2            14              4

work_mem='256MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       14              7
even q1            16              9
even q2            14              5
skew q1            18              6
skew q2            13              4

work_mem='512MB':
              sort+group (s)  hashagg (s)
singleton q1       18             12
singleton q2       14              7
even q1            17              9
even q2            14              5
skew q1            17              6
skew q2            13              4

work_mem='2GB':
              sort+group (s)  hashagg (s)
singleton q1       11             11
singleton q2        7              6
even q1            10              9
even q2             7              5
skew q1             7              6
skew q2             4              4

Regards,
	Jeff Davis

hashagg_test_data.sql
Description: application/sql
Re: [HACKERS] 9.5: Memory-bounded HashAgg
On Thu, 2014-12-11 at 02:46 -0800, Jeff Davis wrote: > On Sun, 2014-08-10 at 14:26 -0700, Jeff Davis wrote: > > This patch is requires the Memory Accounting patch, or something similar > > to track memory usage. > > > > The attached patch enables hashagg to spill to disk, which means that > > hashagg will contain itself to work_mem even if the planner makes a > > bad misestimate of the cardinality. > > New patch attached. All open items are complete, though the patch may > have a few rough edges. > This thread got moved over here: http://www.postgresql.org/message-id/1419326161.24895.13.camel@jeff-desktop Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Sun, Dec 28, 2014 at 12:37 PM, Jeff Davis wrote: > Do others have similar numbers? I'm quite surprised at how little > work_mem seems to matter for these plans (HashJoin might be a different > story though). I feel like I made a mistake -- can someone please do a > sanity check on my numbers? I have seen external sorts that were quicker than internal sorts before. With my abbreviated key patch, under certain circumstances external sorts are faster, while presumably the same thing is true of int4 attribute sorts today. Actually, I saw a 10MB work_mem setting that was marginally faster than a multi-gigabyte one that fit the entire sort in memory. It probably has something to do with caching effects dominating over the expense of more comparisons, since higher work_mem settings that still resulted in an external sort were slower than the 10MB setting. I was surprised by this too, but it has been independently reported by Jeff Janes. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: Better memory accounting, towards memory-bounded HashAgg
On Tue, 2014-12-23 at 01:16 -0800, Jeff Davis wrote:
> New patch attached (rebased, as well).
>
> I also see your other message about adding regression testing. I'm
> hesitant to slow down the tests for everyone to run through this code
> path though. Should I add regression tests, and then remove them later
> after we're more comfortable that it works?

Attached are some tests I ran. First, generate the data sets with hashagg_test_data.sql. Then, do (I used work_mem at the default of 4MB):

set enable_hashagg=false;
\o /tmp/sort.out
\i /tmp/hashagg_test.sql
\o
set enable_hashagg=true;
\o /tmp/hash.out
\i /tmp/hashagg_test.sql

and then diff'd the output to make sure the results are the same (except the plans, of course). The script loads the results into a temp table, then sorts it before outputting, to make the test order-independent. I didn't just add an ORDER BY, because that would change the plan and it would never use hashagg.

I think that has fairly good coverage of the hashagg code. I used 3 different input data sets, byval and byref types (for the group key and args), and a group aggregate query as well as DISTINCT. Let me know if I missed something.

I also did some performance comparisons between disk-based sort+group and disk-based hashagg. The results are quite favorable for hashagg given the data sets I provided. Simply create the data using hashagg_test_data.sql (if not already done), set work_mem to the value you want to test, and run hashagg_test_perf.sql. It uses EXPLAIN ANALYZE for the timings.

singleton: 10M groups of 1
even: 1M groups of 10
skew: wildly different group sizes; see data script
q1: group aggregate query
q2: distinct query

The total memory requirements for the test to run without going to disk range from about 100MB (for "even") to about 1GB (for "singleton"). Regardless of work_mem, these all fit in memory on my machine, so they aren't *really* going to disk.
Also note that, because of how the memory blocks are allocated, and because hashagg waits until memory is exceeded, hashagg might use about double work_mem when work_mem is small (the effect is not important at higher values).

work_mem='1MB':
              sort+group (s)  hashagg (s)
singleton q1       12             10
singleton q2        8              7
even q1            14              7
even q2            10              5
skew q1            22              6
skew q2            16              4

work_mem='4MB':
              sort+group (s)  hashagg (s)
singleton q1       12             11
singleton q2        8              6
even q1            12              7
even q2             9              5
skew q1            19              6
skew q2            13              3

work_mem='16MB':
              sort+group (s)  hashagg (s)
singleton q1       12             11
singleton q2        8              7
even q1            14              7
even q2            10              5
skew q1            15              6
skew q2            12              4

work_mem='64MB':
              sort+group (s)  hashagg (s)
singleton q1       13             12
singleton q2        9              8
even q1            14              8
even q2            10              5
skew q1            17              6
skew q2            13              4

work_mem='256MB':
              sort+group (s)  hashagg (s)
singleton q1       12             12
singleton q2        9              8
even q1            14              7
even q2            11              4
skew q1            16              6
skew q2            13              4

work_mem='512MB':
              sort+group (s)  hashagg (s)
singleton q1       12             12
singleton q2        9              7
even q1            14              7
even q2            10              4
skew q1            16              6
skew q2            12              4

work_mem='2GB':
              sort+group (s)  hashagg (s)
singleton q1        9             12
singleton q2        6              6
even q1             8              7
even q2             6              4
skew q1             7              6
skew q2             5              4

These numbers are great news for disk-based hashagg. It seems to be the same or better than sort+group in nearly all cases (again, this example doesn't actually go to disk, so those numbers may come out differently). Also, the numbers are remarkably stable for varying work_mem for both plans. That means that it doesn't cost much to keep a lower work_mem as long as your system has plenty of memory.

Do others have similar numbers? I'm quite surprised at how little work_mem seems to matter for these plans (HashJoin might be a different story though). I feel like I made a mistake -- can someone please do a sanity check on my numbers?
Re: [HACKERS] Proposal "VACUUM SCHEMA"
On 21.12.2014 at 18:48, Fabrízio de Royes Mello wrote:
> I work with some customers that have databases with a lot of schemas, and
> sometimes we need to run a manual VACUUM on one schema; it would be nice
> to have a new option to run vacuum on relations from a specific schema.
>
> The new syntax could be:
>
> VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] { [ table_name ] | SCHEMA schema_name }
>
> Also I'll add a new option to the "vacuumdb" client:
>
> -S, --schema=SCHEMA
>
> I can work on this feature for the 2015/02 CF.
>
> Thoughts?

This would be useful for ANALYZE, to make it easier to run analyze only for the interesting schemas after a pg_upgrade. I have a database with most of the actively used data in the "public" schema and a number of rarely accessed large logging and archive tables in other schemas. It'd be useful to prioritize analyzing the main tables before doing anything about the rarely used schemas, to allow the database to be put back into production as soon as possible.

/ Oskari
Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code
On Sat, Dec 27, 2014 at 3:42 AM, Alexey Vasiliev wrote:
> Thanks for suggestions.
>
> Patch updated.

Cool, thanks. I just had an extra look at it.

+This is useful, if I using for restore of wal logs some
+external storage (like AWS S3) and no matter what the slave database
+will lag behind the master. The problem, what for each request to
+AWS S3 need to pay, what is why for N nodes, which try to get next
+wal log each 5 seconds will be bigger price, than for example each
+30 seconds.

I reworked this portion of the docs; it is rather incorrect, as the documentation should not use first-person subjects, and I don't believe that referencing any commercial products is a good thing in this context.

+# specifies an optional timeout after nonzero code of restore_command.
+# This can be useful to increase/decrease number of a restore_command calls.

This is still referring to a timeout. That's not good. And the name of the parameter at the top of this comment block is missing.

+static int restore_command_retry_interval = 5000L;

I think that it would be more adapted to set that to 5000, and multiply by 1L. I am also wondering about having a better lower bound, like 100ms, to avoid some abuse of this feature in the retries.

+ ereport(ERROR,
+         (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+          errmsg("\"%s\" must be bigger zero",
+                 "restore_command_retry_interval")));

I'd rather rewrite that to "must have a strictly positive value".

-     * Wait for more WAL to arrive. Time out after 5 seconds,
+     * Wait for more WAL to arrive. Time out after
+     * restore_command_retry_interval (5 seconds by default),
      * like when polling the archive, to react to a trigger
      * file promptly.
      */
      WaitLatch(&XLogCtl->recoveryWakeupLatch,
                WL_LATCH_SET | WL_TIMEOUT,
-               5000L);
+               restore_command_retry_interval);

I should have noticed earlier, but in its current state your patch actually does not work. What you are doing here is tuning the time the process waits for WAL from stream.
In your case what you want to control is the retry time for a restore_command in archive recovery, no? -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] recovery_min_apply_delay with a negative value
Hi all,

While reviewing another patch, I have noticed that recovery_min_apply_delay can have a negative value. And the funny part is that we actually attempt to apply a delay even in this case, per this condition in recoveryApplyDelay@xlog.c:

    /* nothing to do if no delay configured */
    if (recovery_min_apply_delay == 0)
        return false;

Shouldn't we simply leave if recovery_min_apply_delay is lower than 0, and not only when it is equal to 0?

Regards,
--
Michael
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 10:18 PM, Ravi Kiran wrote: > Sir, I followed the instructions in the link which you gave , but this > time I am getting the following error. > > *Can't find a source file at > "/build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c" * > *Locate the file or edit the source lookup path to include its location.* > * > * > is the error something related my OS or is it related to postgres. I think it's best to discuss this somewhere other than pgsql-hackers, as this discussion isn't going to be of interest to the majority of people on this mailing list. What you're asking about now isn't really anything to do with PostgreSQL at all. I replied to your Stack Overflow post at http://stackoverflow.com/q/27676836/398670 . -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] attaching a process in eclipse
Sir, I followed the instructions in the link which you gave, but this time I am getting the following error:

    Can't find a source file at "/build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c"
    Locate the file or edit the source lookup path to include its location.

Is the error something related to my OS, or is it related to postgres? Thank you for telling me the mistake; I will follow the things which you told me from next time. Thank you

On Sun, Dec 28, 2014 at 7:36 PM, Craig Ringer wrote:
> On 12/28/2014 07:49 PM, Ravi Kiran wrote:
>> Thank you for the response sir, I am running both the eclipse and the
>> client under the same user name which is ravi, I have installed postgres
>> source code under the user ravi not postgres,
>
> It doesn't matter how you installed it. How you *run* it matters. If the
> postgresql processes run as user 'ravi' and so does Eclipse, then your
> problem is the one in the article I already sent you a link to, and you
> should follow those instructions.
>
>> should I change it and work under postgres
>
> No. If you read my explanation you would understand that this would just
> make it worse, because you can't attach to a process under a different
> user id.
>
>> and for that to happen should I uninstall the whole
>> postgres and re install under the new user postgres.
>
> That won't help.
>
> By the way, it looks like you also posted this to Stack Overflow:
>
> http://stackoverflow.com/q/27676836/398670
>
> It's OK to do that, but *please mention you posted somewhere else too*
> or preferably link to the relevant article in the mailing list archives.
>
> Also, when replying to mail on a list, please reply-to-all.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 07:49 PM, Ravi Kiran wrote:
> Thank you for the response sir, I am running both the eclipse and the
> client under the same user name which is ravi, I have installed postgres
> source code under the user ravi not postgres,

It doesn't matter how you installed it. How you *run* it matters. If the PostgreSQL processes run as user 'ravi' and so does Eclipse, then your problem is the one in the article I already sent you a link to, and you should follow those instructions.

> should I change it and
> work under postgres

No. If you read my explanation you would understand that this would just make it worse, because you can't attach to a process under a different user id.

> and for that to happen should I uninstall the whole
> postgres and re install under the new user postgres.

That won't help.

By the way, it looks like you also posted this to Stack Overflow:

http://stackoverflow.com/q/27676836/398670

It's OK to do that, but *please mention you posted somewhere else too* or preferably link to the relevant article in the mailing list archives.

Also, when replying to mail on a list, please reply-to-all.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes
On Fri, Dec 26, 2014 at 4:16 PM, Michael Paquier wrote:
> On Fri, Dec 26, 2014 at 3:24 PM, Fujii Masao wrote:
>> pglz_compress() and pglz_decompress() still use PGLZ_Header, so the frontend
>> which uses those functions needs to handle PGLZ_Header. But it basically should
>> be handled via the varlena macros. That is, the frontend still seems to need to
>> understand the varlena datatype. I think we should avoid that. Thought?
> Hm, yes it may be wiser to remove it and make the data passed to pglz
> for varlena 8 bytes shorter..

OK, here is the result of this work, made of 3 patches. The first two patches move the pglz stuff to src/common and make it a frontend utility entirely independent of varlena and its related metadata.

- Patch 1 is a simple move of pglz to src/common, with PGLZ_Header still present. There is nothing amazing here; this is the broken version that was reverted in 966115c.
- The real stuff comes with patch 2, which removes PGLZ_Header, changing the compression and decompression APIs of pglz so that they no longer carry toast metadata, that metadata now being localized in tuptoaster.c. Note that this patch preserves the on-disk format (tested with pg_upgrade from 9.4 to a patched HEAD server). Here is what the compression and decompression APIs look like with this patch, simply performing operations from a source to a destination:

extern int32 pglz_compress(const char *source, int32 slen, char *dest,
                           const PGLZ_Strategy *strategy);
extern int32 pglz_decompress(const char *source, char *dest,
                             int32 compressed_size, int32 raw_size);

The return value of those functions is the number of bytes written to the destination buffer, and 0 if the operation failed. This also aims to make the backend more pluggable. The reason patch 2 exists separately (it could be merged with patch 1) is to facilitate review of the changes made to pglz to make it an entirely independent facility.
Patch 3 is the FPW compression itself, adjusted to fit with those changes. Note that since PGLZ_Header, which contained the raw size of the compressed data, no longer exists, it is necessary to store the raw length of the block image directly in the block image header, using 2 additional bytes. Those 2 bytes are used only if wal_compression is set to true, thanks to a boolean flag, so if wal_compression is disabled the WAL record length is exactly the same as on HEAD, and there is no penalty in the default case. As in the previous patches, the block image is compressed without its hole.

To finish, here are some results using the same test as before, with the hack on getrusage to get the system and user CPU diff on a single backend execution:
http://www.postgresql.org/message-id/cab7npqsc97o-ue5paxfmukwcxe_jioyxo1m4a0pmnmyqane...@mail.gmail.com

Just as a reminder, this test generated a fixed number of FPWs on a single backend with fsync and autovacuum disabled, with several values of fillfactor to see the effect of page holes.

  test   | ffactor | user_diff | system_diff | pg_size_pretty
---------+---------+-----------+-------------+----------------
 FPW on  |      50 | 48.823907 |    0.737649 | 582 MB
 FPW on  |      20 | 16.135000 |    0.764682 | 229 MB
 FPW on  |      10 |  8.521099 |    0.751947 | 116 MB
 FPW off |      50 | 29.722793 |    1.045577 | 746 MB
 FPW off |      20 | 12.673375 |    0.905422 | 293 MB
 FPW off |      10 |  6.723120 |    0.779936 | 148 MB
 HEAD    |      50 | 30.763136 |    1.129822 | 746 MB
 HEAD    |      20 | 13.340823 |    0.893365 | 293 MB
 HEAD    |      10 |  7.267311 |    0.909057 | 148 MB
(9 rows)

Results are similar to what was measured previously; it doesn't hurt to check again, but roughly the CPU cost is balanced by the WAL record reduction. There is no difference in terms of WAL record length between HEAD and this patch when wal_compression = off.

Patches, as well as the test script and the results, are attached.
Regards,
--
Michael

Attachments: results.sql, test_compress, 20141228_fpw_compression_v12.tar.gz
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 06:37 PM, Ravi Kiran wrote:
>
> "Could not attach to process. If your uid matches the uid of the target
> process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
> again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"
>
> how do we rectify this error?

Oh, and if you *are* debugging a PostgreSQL process under the same uid as your own, adjust the ptrace_scope setting in the yama security module, like the error message above tells you to. Details here:

http://askubuntu.com/questions/41629/after-upgrade-gdb-wont-attach-to-process

which I found by *searching for the error message text*.

BTW, in general I recommend that when you're modifying and debugging PostgreSQL you run it under your normal user account using 'postgres' or 'pg_ctl', rather than messing with an operating system installation. It's much easier.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] attaching a process in eclipse
On 12/28/2014 06:37 PM, Ravi Kiran wrote:
>
> "Could not attach to process. If your uid matches the uid of the target
> process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try
> again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"
>
> how do we rectify this error?

At a guess, the PostgreSQL you are trying to attach to is not running under the same user account as Eclipse. By default the Linux kernel only permits you to attach to a process running under the same user account as you, unless you are the root user. That's what "if your uid matches the uid of the target process" refers to.

You'll have the same issue when attaching using gdb directly; this isn't an Eclipse issue. If you were debugging manually with gdb you'd often do something like

    sudo -u postgres gdb -p thepid

to run gdb as the same user id as the target process. (Avoid running gdb as root whenever possible.) Perhaps Eclipse lets you configure a prefix command for gdb so you can do that with Eclipse and passwordless sudo?

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
[HACKERS] attaching a process in eclipse
hi,

I am working with the PostgreSQL 9.4.0 source using Eclipse (Indigo) on Ubuntu 14.04. I am facing a problem attaching a client process to the PostgreSQL server. I am following the steps given in this link:

https://wiki.postgresql.org/wiki/Working_with_Eclipse#Debugging_PostgreSQL_from_Eclipse

In the debug configuration, I have given the following information in the fields:

C/C++ Application: src/backend/postgres
Project: postgresql-9.4.0

I found out the process id of the client process using "select pg_backend_pid()" and used it to attach to the server. But I am getting the following error whenever I do this:

"Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf"

how do we rectify this error?

Thank you

Regards
K.Ravikiran
Re: [HACKERS] nls and server log
On 12/25/2014 02:35 AM, Euler Taveira wrote:
> Hi,
>
> Currently the same message goes to server log and client app. Sometimes
> it bothers me since I have to analyze server logs and discovered that
> lc_messages is set to pt_BR and to worse things that stup^H^H^H
> application parse some error messages in portuguese.

IMO logging is simply broken for platforms where the postmaster and all DBs don't share an encoding. We mix different encodings in log messages and provide no way to separate them out. Nor is there a way to log different messages to different files.

It's not just an issue with translations. We mix and mangle encodings of user-supplied text, like RAISE strings in procs, for example.

We really need to be treating encoding for logging and for the client much more separately than we currently do. I think any consideration of translations for logging should be done with the underlying encoding issues in mind.

My personal opinion is that we should require the server log to be capable of representing all chars in the encodings used by any DB, which in practice means we always just log in UTF-8 if the user wants to permit DBs with different encodings. An alternative would be one file per database, always in the encoding of that database.

--
Craig Ringer                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services