Merlin Moncure <mmonc...@gmail.com> Tuesday 22 March 2011 23:06:02
> On Tue, Mar 22, 2011 at 4:28 PM, Radosław Smogura
> <rsmog...@softperience.eu> wrote:
> > Merlin Moncure <mmonc...@gmail.com> Monday 21 March 2011 20:58:16
> >
> >> On Mon, Mar 21, 2011 at 2:08 PM, Greg Stark <gsst...@mit.edu> wrote:
> >> > On Mon, Mar 21, 2011 at 3:54 PM, Merlin Moncure
> >> > <mmonc...@gmail.com> wrote:
> >> >> Can't you make just one large mapping and lock it in 8k regions? I
> >> >> thought the problem with mmap was not being able to detect other
> >> >> processes
> >> >> (http://www.mail-archive.com/pgsql-general@postgresql.org/msg122301.html),
> >> >> compatibility issues (possibly obsolete), etc.
> >> >
> >> > I was assuming that locking part of a mapping would force the kernel
> >> > to split the mapping. It has to record the locked state somewhere, so
> >> > it needs a data structure that represents the size of the locked
> >> > section, and that would, I assume, be the mapping.
> >> >
> >> > It's possible the kernel would not in fact fall over too badly doing
> >> > this. At some point I'll go ahead and run experiments on it. It's a
> >> > bit fraught, though, as the performance may depend on the memory
> >> > management features of the chipset.
> >> >
> >> > That said, that's only part of the battle. On 32-bit you can't map
> >> > the whole database, as your database could easily be larger than
> >> > your address space. I have some ideas on how to tackle that, but the
> >> > simplest test would be to just mmap 8kB chunks everywhere.
> >>
> >> Even on 64-bit systems you only have a 48-bit address space, which is
> >> not a purely theoretical limitation. However, at least on Linux you
> >> can map in and map out pretty quickly (10 microseconds for the pair on
> >> my Linux VM), so that's not such a big deal. Dealing with rapidly
> >> growing files is a problem. That said, you are probably not going to
> >> want to reserve multiple gigabytes in 8k non-contiguous chunks.
> >>
> >> > But it's worse than that. Since you're no longer responsible for
> >> > flushing blocks to disk, you need some way to *unlock* a block when
> >> > it becomes possible to flush it. That means when you flush the xlog
> >> > you have to somehow find all the blocks that might no longer need to
> >> > be locked and atomically unlock them. That would require new
> >> > infrastructure we don't have, though it might not be too hard.
> >> >
> >> > What would be nice is an mlock_until() where you eventually issue a
> >> > call to tell the kernel what point in time you've reached, and it
> >> > unlocks everything older than that time.
> >>
> >> I wonder if there is any reason to mlock at all... if you are going to
> >> 'do' mmap, can't you just hide under the current lock architecture for
> >> the actual locking and do direct memory access without mlock?
> >>
> >> merlin
> >
> > Actually, after dealing with mmap and adding munmap, I found a crucial
> > reason not to use mmap: you need to munmap, and for me this takes a lot
> > of time. Even if I read with MAP_SHARED | PROT_READ, it looks like
> > Linux does a flush or something similar; the same happens with
> > MAP_FIXED, MAP_PRIVATE, etc.
>
> Can you produce a small program demonstrating the problem? This is not
> how things should work, AIUI.
>
> I was thinking about playing with an mmap implementation of the clog
> system -- it's perhaps a better fit. clog has a rigidly defined size and
> very high performance requirements. It is also a much smaller change
> than reimplementing heap buffering, and maybe not so much affected by
> munmap.
> merlin
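
Re the small program: below is a minimal sketch of the kind of
microbenchmark that could demonstrate it (my assumptions, not code from
the earlier tests): it maps an existing file in 8 kB chunks with
PROT_READ | MAP_SHARED, faults each page in, and times only the munmap()
calls. The file argument, page count, and iteration count are arbitrary.

/*
 * Hypothetical microbenchmark: time munmap() of 8 kB PROT_READ,
 * MAP_SHARED file mappings.  Expects an existing file of at least
 * FILEPAGES * CHUNK bytes as argv[1].
 */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/mman.h>
#include <sys/time.h>

#define CHUNK     8192        /* one PostgreSQL-sized block */
#define FILEPAGES 1024        /* cycle over an 8 MB file */
#define ITERS     100000

int main(int argc, char **argv)
{
    if (argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct timeval t0, t1;
    long unmap_usec = 0;
    volatile char sink = 0;

    for (long i = 0; i < ITERS; i++)
    {
        off_t off = (off_t) (i % FILEPAGES) * CHUNK;
        char *p = mmap(NULL, CHUNK, PROT_READ, MAP_SHARED, fd, off);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        sink += p[0];                     /* fault the page in */

        gettimeofday(&t0, NULL);          /* time only the unmap */
        if (munmap(p, CHUNK) != 0) { perror("munmap"); return 1; }
        gettimeofday(&t1, NULL);

        unmap_usec += (t1.tv_sec - t0.tv_sec) * 1000000L
                    + (t1.tv_usec - t0.tv_usec);
    }

    printf("avg munmap: %.3f us over %d iterations\n",
           (double) unmap_usec / ITERS, ITERS);
    close(fd);
    return 0;
}

If munmap of a clean, read-only MAP_SHARED page really shows flush-like
latency here, it would be easy to compare against MAP_PRIVATE by
changing one flag.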
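And on the earlier point about locking 8k regions of one large mapping:
a quick way to check Greg's concern that a partial mlock() forces the
kernel to split the VMA would be something like the sketch below
(hypothetical, untested here) -- lock one page in the middle of a big
anonymous mapping and look at /proc/self/maps; a split shows up as three
adjacent entries where there was one.

/*
 * Hypothetical check: does mlock() on part of a mapping split the VMA?
 * Locks one 8 kB region inside a 1 GB anonymous mapping, then dumps
 * /proc/<pid>/maps so the VMA layout can be inspected.
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void)
{
    size_t size = 1024UL * 1024 * 1024;   /* 1 GB */
    char *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }

    /* lock one 8 kB "buffer" in the middle of the mapping */
    if (mlock(base + size / 2, 8192) != 0)
        perror("mlock");    /* may need RLIMIT_MEMLOCK raised */

    /* a split mapping appears as three adjacent lines here */
    char cmd[32];
    snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", (int) getpid());
    system(cmd);

    munmap(base, size);
    return 0;
}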
Ah... just one thing that may be useful for understanding why
performance is lost with huge memory: I saw the mmapped buffers are
allocated at addresses like 0x007..., so definitely above 4 GB.
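
For reference, a trivial way to see where the kernel places mappings
(hypothetical snippet; it uses an anonymous mapping rather than a real
buffer file):

/* Hypothetical snippet: print where mmap places a mapping.  On 64-bit
 * Linux the address is typically in the 0x00007f... range, i.e. far
 * above 4 GB. */
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    void *p = mmap(NULL, 8192, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return 1;
    printf("mapped at %p\n", p);   /* e.g. 0x7f..., above 4 GB */
    munmap(p, 8192);
    return 0;
}

Running it a few times should show addresses in that high range.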