Re: [HACKERS] Multiple sorts in a query

2009-05-21 Thread Simon Riggs

On Wed, 2009-05-20 at 23:01 -0400, Zdenek Kotala wrote:

 We use the default one. I have not tested the difference between them, but IIRC
 Jignesh did some testing with umem. I will ask him. However, if you
 give me a test scenario I can test it.

Talk with Dimitri from Sun who is doing scalability benchmarks on
pgsql-perform list now.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multiple sorts in a query

2009-05-20 Thread Simon Riggs

On Tue, 2009-05-19 at 22:19 +0200, Zdenek Kotala wrote:
 Chuck McDevitt wrote on Tue, 19 May 2009 at 09:33 -0700:

  Solaris default malloc always uses sbrk(), and never ever tries to reduce
  the sbrk point.
  
  If you want a malloc that uses mmap, there is a non-default malloc that
  does that (libumem or something?)
 
 There are several memory allocators on Solaris. You can choose the one you
 need. See
 
 mapmalloc (it uses mmap instead of sbrk)
 mtmalloc (optimized for multi-threaded apps)
 umem_alloc
 watchmalloc
 bsdmalloc
  
 maybe more.
 
 What I heard is that the standard malloc is not good, but it is still here
 for compatibility with old applications that depend on some of its
 functionality.

Which one is used in the default PostgreSQL build for Solaris? If you
use default malloc, have you tested the others and would you recommend
one in particular?

Which one has Dimitri used in his performance testing?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-20 Thread Simon Riggs

On Tue, 2009-05-19 at 16:49 -0400, Greg Stark wrote:

 Well I'm just saying if you realloc a x kilobyte block into a 2x block  
 and the allocator can't expand it and has to copy then it seems  
 inevitable.

OK, understood.

So there are grounds at least for an investigation into how that works
and whether it is as inefficient as we think it might be.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-20 Thread Andres Freund

On 05/20/2009 10:14 AM, Simon Riggs wrote:

On Tue, 2009-05-19 at 22:19 +0200, Zdenek Kotala wrote:

Chuck McDevitt wrote on Tue, 19 May 2009 at 09:33 -0700:
What I heard is that the standard malloc is not good, but it is still here
for compatibility with old applications that depend on some of its
functionality.


Which one is used in the default PostgreSQL build for Solaris? If you
use default malloc, have you tested the others and would you recommend
one in particular?

You don't even need to recompile it most of the time (unless it is statically
linked or something similar). LD_PRELOAD'ing another malloc library
should normally be enough.
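
A minimal sketch of the LD_PRELOAD approach, assuming a dynamically linked
server binary. The helper name preload_env and the library path in the usage
comment are illustrative assumptions, not something taken from this thread;
check the actual library location on your system.

```python
import os


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path added to LD_PRELOAD.

    preload_env is a hypothetical helper name used only for this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD")
    # Put the replacement allocator first so it is loaded before any
    # library that was already listed. A space is used as the separator.
    env["LD_PRELOAD"] = library_path if not existing else library_path + " " + existing
    return env


# Hypothetical usage (path and command line are assumptions):
#   import subprocess
#   subprocess.Popen(["postgres", "-D", "/path/to/data"],
#                    env=preload_env("/usr/lib/libumem.so.1"))
```

Child processes inherit the environment, so backends forked by the server
started this way would pick up the same allocator.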


Andres



Re: realloc overhead (was [HACKERS] Multiple sorts in a query)

2009-05-20 Thread pg
 So at least transiently we use 3x the size of the actual array.
 I was conjecturing, prior to investigation. Are you saying you know 
 this/have seen this already?
 Well I'm just saying if you realloc a x kilobyte block into a 2x block and 
 the allocator can't expand it and has to copy then it seems inevitable.

FYI the malloc()/realloc()/free() on FC4 causes memory fragmentation, and thus
a long-term growth in process memory, under some circumstances. This, together
with the power-of-two allocations in aset.c not accounting for malloc()
overhead (not that they could), implies that memory contexts can cause
fragmentation, more slowly, too.

Reallocations of smallish blocks from memory contexts tend to use memory
already withheld from the OS; a transient increase in memory usage is possible,
but unlikely to matter. Perhaps something should be done about larger blocks.

David Hudson




Re: [HACKERS] Multiple sorts in a query

2009-05-20 Thread Zdenek Kotala

Simon Riggs wrote on Wed, 20 May 2009 at 09:14 +0100:

  
  What I heard is that the standard malloc is not good, but it is still here
  for compatibility with old applications that depend on some of its
  functionality.
 
 Which one is used in the default PostgreSQL build for Solaris? If you
 use default malloc, have you tested the others and would you recommend
 one in particular?

We use the default one. I have not tested the difference between them, but IIRC
Jignesh did some testing with umem. I will ask him. However, if you
give me a test scenario I can test it.

Zdenek




[HACKERS] Multiple sorts in a query

2009-05-19 Thread Simon Riggs

Just wanted to check some thoughts about how memory allocation works in
complex queries. Been thinking some more about recent Solaris testing
results that *seemed* to show issues with multiple concurrent queries
that have multiple sorts.

If we have a query that uses multiple sorts, we may have a top-level
sort, with child nodes that contain sorts also. In some cases we may
find sub-nodes whose inner and outer sub-trees both
contain sorts.

If we allocate large chunks of memory we use malloc(). So complex
queries can have multiple mallocs, followed by multiple reallocs. That
in itself seems likely to end up with roughly double memory use, since
realloc won't work properly/quickly with multiple mallocs. (Double since
we allocate X bytes, then 2X bytes etc until we hit the limit.)

When we later free() the memory, do we always free() it in the reverse
order in which it was allocated? If not, how does that affect reducing
the sbrk point, or other aspects of reusing allocated memory?

Is it possible that Solaris's default malloc isn't appropriate for
repeated use in complex queries that use multiple sorts?
http://developers.sun.com/solaris/articles/multiproc/multiproc.html
and recent OpenSolaris bug reports.

Anyway, feel free to jump in.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Martijn van Oosterhout
On Tue, May 19, 2009 at 12:32:13PM +0100, Simon Riggs wrote:
 If we allocate large chunks of memory we use malloc(). So complex
 queries can have multiple mallocs, followed by multiple reallocs. That
 in itself seems likely to end up with roughly double memory use, since
 realloc won't work properly/quickly with multiple mallocs. (Double since
 we allocate X bytes, then 2X bytes etc until we hit the limit.)

I don't know about Solaris, but glibc has a threshold above which it
starts using mmap() instead of sbrk(). Thus, once you start using very
large blocks, freeing always returns the memory to the kernel,
irrespective of other allocations.

The threshold is dynamic apparently, but starts at 128KB.
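
As a rough illustration of what that means for an array that grows by
doubling, here is a simplified model. It assumes the threshold stays fixed at
its initial 128KB and ignores per-chunk overhead; the function names are
made up for this sketch.

```python
MMAP_THRESHOLD = 128 * 1024  # initial value mentioned above; treated as fixed here


def doubling_sizes(start, limit):
    """Yield the request sizes seen when a buffer doubles from start up to limit."""
    size = start
    while size <= limit:
        yield size
        size *= 2


def served_by(size, threshold=MMAP_THRESHOLD):
    """Simplified rule: requests at or above the threshold go to mmap(),
    smaller ones come from the brk heap."""
    return "mmap" if size >= threshold else "brk"


if __name__ == "__main__":
    for size in doubling_sizes(32 * 1024, 1024 * 1024):
        print(size, served_by(size))
```

Under these assumptions only the first couple of doublings land on the brk
heap; every later, larger block would be handed back to the kernel on free().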

Just a thought,

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Please line up in a tree and maintain the heap invariant while 
 boarding. Thank you for flying nlogn airlines.




Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Simon Riggs

On Tue, 2009-05-19 at 09:17 -0400, Merlin Moncure wrote:
 On Tue, May 19, 2009 at 7:44 AM, Martijn van Oosterhout
 
  The threshold is dynamic apparently, but starts at 128KB.
 
 I just read an article that suggests assuming that can be dangerous
 (by one of the authors of jemalloc)...an interesting read.

 http://www.canonware.com/~ttt/2009/05/mr-malloc-gets-schooled.html

Thanks both, interesting read. Hmmm...

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Greg Stark
On Tue, May 19, 2009 at 12:32 PM, Simon Riggs si...@2ndquadrant.com wrote:

 If we have a query that uses multiple sorts, we may have a top-level
 sort, with child nodes that contain sorts also. In some cases we may
 find sub-nodes whose inner and outer sub-trees both
 contain sorts.

Well a top-level sort and a child sort wouldn't both be accumulating
rows at the same time. The child could still be alive behaving like a
tuplestore though.

 If we allocate large chunks of memory we use malloc(). So complex
 queries can have multiple mallocs, followed by multiple reallocs. That
 in itself seems likely to end up with roughly double memory use, since
 realloc won't work properly/quickly with multiple mallocs. (Double since
 we allocate X bytes, then 2X bytes etc until we hit the limit.)

I think it's even worse than that since the old and new allocation
have to briefly coexist. So at least transiently we use 3x the size of
the actual array.
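
The arithmetic can be sketched as follows, assuming the worst case where
every realloc() has to allocate a fresh block and copy rather than extend in
place. The function name is invented for the example.

```python
def growth_steps(initial, final):
    """List (old_size, new_size, transient_total) for each doubling step,
    assuming the old and new blocks coexist until the copy completes."""
    steps = []
    size = initial
    while size < final:
        new_size = size * 2
        steps.append((size, new_size, size + new_size))
        size = new_size
    return steps


if __name__ == "__main__":
    for old, new, total in growth_steps(1024, 8192):
        # total is always 3x the amount of data held at that moment (old)
        print(old, new, total)
```

At each step the data actually held is old_size bytes, while old_size plus
2 * old_size bytes are allocated, which is where the 3x figure comes from.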

 When we later free() the memory, do we always free() it in the reverse
 order in which it was allocated? If not, how does that affect reducing
 the sbrk point, or other aspects of reusing allocated memory?

 Is it possible that Solaris's default malloc isn't appropriate for
 repeated use in complex queries that use multiple sorts?

Well anything's possible. Do you have any specific ideas? I would
expect any decent malloc library to shrink sbrk based on statically
analyzing where its allocations actually are, so I wouldn't expect the
pattern of frees to matter on that front. It might still fragment
memory if we allocate a bunch of large tuplestore/tuplesorts and then
allocate one object in a longer lived memory context.
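
A toy model of that last case, assuming a purely brk-style heap where blocks
are laid out in allocation order and the break can only drop to the end of
the highest block still in use. Block names and sizes are invented.

```python
def break_after_frees(blocks, freed):
    """blocks: list of (name, size) in allocation order on a brk-style heap.
    freed: set of names that have been freed.

    Returns the lowest break the allocator could shrink to: space above the
    highest live block can be returned, space below it cannot.
    """
    top = 0
    offset = 0
    for name, size in blocks:
        offset += size
        if name not in freed:
            top = offset  # this live block pins everything beneath it
    return top


if __name__ == "__main__":
    heap = [("sort1", 100), ("sort2", 200), ("long_lived", 1)]
    # Both sorts are freed, but the small later allocation keeps the break high.
    print(break_after_frees(heap, {"sort1", "sort2"}))
```

In this model the order of the free() calls does not matter; what matters is
which block ends up highest among those still allocated.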

What problems have you seen?

-- 
greg



Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Merlin Moncure
On Tue, May 19, 2009 at 7:44 AM, Martijn van Oosterhout
klep...@svana.org wrote:
 On Tue, May 19, 2009 at 12:32:13PM +0100, Simon Riggs wrote:
 If we allocate large chunks of memory we use malloc(). So complex
 queries can have multiple mallocs, followed by multiple reallocs. That
 in itself seems likely to end up with roughly double memory use, since
 realloc won't work properly/quickly with multiple mallocs. (Double since
 we allocate X bytes, then 2X bytes etc until we hit the limit.)

 I don't know about Solaris, but glibc has a threshold above which it
 starts using mmap() instead of sbrk(). Thus, once you start using very
 large blocks, freeing always returns the memory to the kernel,
 irrespective of other allocations.

 The threshold is dynamic apparently, but starts at 128KB.

I just read an article that suggests assuming that can be dangerous
(by one of the authors of jemalloc)...an interesting read.

Update in 2006:
The above was written in 2001. Since then the world has changed a lot.
Memory got bigger. Applications got bigger. The virtual address space
layout in 32 bit linux changed.

In the new situation, brk() and mmap space is shared and there are no
artificial limits on brk size imposed by the kernel. What is more,
applications have started using transient allocations larger than the
128Kb as was imagined in 2001.

http://www.canonware.com/~ttt/2009/05/mr-malloc-gets-schooled.html

merlin



Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Chuck McDevitt

 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-
 ow...@postgresql.org] On Behalf Of Simon Riggs
 Sent: Tuesday, May 19, 2009 4:32 AM
 To: pgsql-hackers
 Subject: [HACKERS] Multiple sorts in a query
 
 
 Just wanted to check some thoughts about how memory allocation works in
 complex queries. Been thinking some more about recent Solaris testing
 results that *seemed* to show issues with multiple concurrent queries
 that have multiple sorts.
 
 If we have a query that uses multiple sorts, we may have a top-level
 sort, with child nodes that contain sorts also. In some cases we may
 find sub-nodes whose inner and outer sub-trees both
 contain sorts.
 
 If we allocate large chunks of memory we use malloc(). So complex
 queries can have multiple mallocs, followed by multiple reallocs. That
 in itself seems likely to end up with roughly double memory use, since
 realloc won't work properly/quickly with multiple mallocs. (Double since
 we allocate X bytes, then 2X bytes etc until we hit the limit.)
 
 When we later free() the memory, do we always free() it in the reverse
 order in which it was allocated? If not, how does that affect reducing
 the sbrk point, or other aspects of reusing allocated memory?
 
 Is it possible that Solaris's default malloc isn't appropriate for
 repeated use in complex queries that use multiple sorts?
 http://developers.sun.com/solaris/articles/multiproc/multiproc.html
 and recent OpenSolaris bug reports.

Solaris default malloc always uses sbrk(), and never ever tries to reduce the
sbrk point.

If you want a malloc that uses mmap, there is a non-default malloc that does
that (libumem or something?)



Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Simon Riggs

On Tue, 2009-05-19 at 09:33 -0700, Chuck McDevitt wrote:

  Is it possible that Solaris's default malloc isn't appropriate for
  repeated use in complex queries that use multiple sorts?
  http://developers.sun.com/solaris/articles/multiproc/multiproc.html
  and recent OpenSolaris bug reports.
 
 Solaris default malloc always uses sbrk(), and never ever tries to
 reduce the sbrk point.
 
 If you want a malloc that uses mmap, there is a non-default malloc
 that does that (libumem or something?)

OK, thanks Chuck. Doesn't sound good.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Simon Riggs

On Tue, 2009-05-19 at 13:52 +0100, Greg Stark wrote:

 So at least transiently we use 3x the size of the actual array.

I was conjecturing, prior to investigation. Are you saying you know
this/have seen this already?

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support




Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Greg Stark
Well I'm just saying if you realloc a x kilobyte block into a 2x block  
and the allocator can't expand it and has to copy then it seems  
inevitable.


--
Greg


On 19 May 2009, at 14:11, Simon Riggs si...@2ndquadrant.com wrote:



On Tue, 2009-05-19 at 13:52 +0100, Greg Stark wrote:


So at least transiently we use 3x the size of the actual array.


I was conjecturing, prior to investigation. Are you saying you know
this/have seen this already?

--
Simon Riggs   www.2ndQuadrant.com
PostgreSQL Training, Services and Support





Re: [HACKERS] Multiple sorts in a query

2009-05-19 Thread Zdenek Kotala

Chuck McDevitt wrote on Tue, 19 May 2009 at 09:33 -0700:

 
 Solaris default malloc always uses sbrk(), and never ever tries to reduce the
 sbrk point.
 
 If you want a malloc that uses mmap, there is a non-default malloc that does
 that (libumem or something?)
 

There are several memory allocators on Solaris. You can choose the one you
need. See

mapmalloc (it uses mmap instead of sbrk)
mtmalloc (optimized for multi-threaded apps)
umem_alloc
watchmalloc
bsdmalloc
 
maybe more.

What I heard is that the standard malloc is not good, but it is still here
for compatibility with old applications that depend on some of its
functionality.

Zdenek 

