Merlin Moncure <mmonc...@gmail.com> Tuesday 22 March 2011 23:06:02
> On Tue, Mar 22, 2011 at 4:28 PM, Radosław Smogura
> <rsmog...@softperience.eu> wrote:
> > Merlin Moncure <mmonc...@gmail.com> Monday 21 March 2011 20:58:16
> > 
> >> On Mon, Mar 21, 2011 at 2:08 PM, Greg Stark <gsst...@mit.edu> wrote:
> >> > On Mon, Mar 21, 2011 at 3:54 PM, Merlin Moncure <mmonc...@gmail.com> wrote:
> >> >> Can't you make just one large mapping and lock it in 8k regions? I
> >> >> thought the problem with mmap was not being able to detect other
> >> >> processes
> >> >> (http://www.mail-archive.com/pgsql-general@postgresql.org/msg122301.html),
> >> >> compatibility issues (possibly obsolete), etc.
> >> > 
> >> > I was assuming that locking part of a mapping would force the kernel
> >> > to split the mapping. It has to record the locked state somewhere so
> >> > it needs a data structure that represents the size of the locked
> >> > section and that would, I assume, be the mapping.
> >> > 
> >> > It's possible the kernel would not in fact fall over too badly doing
> >> > this. At some point I'll go ahead and do experiments on it. It's a bit
> >> > fraught though, as the performance may depend on the memory
> >> > management features of the chipset.
> >> > 
> >> > That said, that's only part of the battle. On 32bit you can't map the
> >> > whole database as your database could easily be larger than your
> >> > address space. I have some ideas on how to tackle that but the
> >> > simplest test would be to just mmap 8kB chunks everywhere.
> >> 
> >> Even on 64 bit systems you only have 48 bit address space which is not
> >> a theoretical  limitation.  However, at least on linux you can map in
> >> and map out pretty quick (10 microseconds paired on my linux vm) so
> >> that's not so big of a deal.  Dealing with rapidly growing files is a
> >> problem.  That said, probably you are not going to want to reserve
> >> multiple gigabytes in 8k non contiguous chunks.
> >> 
> >> > But it's worse than that. Since you're not responsible for flushing
> >> > blocks to disk any longer you need some way to *unlock* a block when
> >> > it's possible to be flushed. That means when you flush the xlog you
> >> > have to somehow find all the blocks that might no longer need to be
> >> > locked and atomically unlock them. That would require new
> >> > infrastructure we don't have though it might not be too hard.
> >> > 
> >> > What would be nice is a mlock_until() where you eventually issue a
> >> > call to tell the kernel what point in time you've reached and it
> >> > unlocks everything older than that time.
> >> 
> >> I wonder if there is any reason to mlock at all...if you are going to
> >> 'do' mmap, can't you just hide under current lock architecture for
> >> actual locking and do direct memory access without mlock?
> >> 
> >> merlin
> > 
> > Actually, after dealing with mmap and adding munmap I found the crucial
> > reason not to use mmap:
> > You need to munmap, and for me this takes a lot of time. Even if I map with
> > MAP_SHARED | PROT_READ, it looks like Linux does a flush or something else;
> > the same happens with MAP_FIXED, MAP_PRIVATE, etc.
> 
> Can you produce a small program demonstrating the problem?  This is not
> how things should work AIUI.
> 
> I was thinking about playing with an mmap implementation of the clog
> system -- it's perhaps a better fit.  clog has a rigidly defined size and
> very high performance requirements.  Also it needs far fewer changes than
> reimplementing heap buffering, and is maybe not so much affected by
> munmap.
> 
> merlin

Ah... just one more thing, which may help explain why performance is lost
with huge memory: I saw that the mmapped buffers are allocated at addresses
like 0x007, so definitely above 4 GB.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
