Re: [GENERAL] raw partition

2001-08-27 Thread Tom Lane

Martijn van Oosterhout <[EMAIL PROTECTED]> writes:
> You would still however get the advantage that you wouldn't have to copy the
> data from the disk buffers to user space, you simply get the disk buffer
> mapped into your address space.

AFAICS this would be the *only* advantage.  While it's not negligible,
it's quite unclear that it's worth the bookkeeping and portability
headaches of managing lots of mmap'd areas, either.
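
To make the copy-versus-map distinction concrete, here is a minimal sketch using Python's standard mmap module. It assumes a Unix-like system; the function name copy_vs_map and the sample bytes are invented for illustration and have nothing to do with PostgreSQL's own code. Note that Python itself copies bytes when slicing a mapping, so this only shows where the data lives, not the saved copy cost.

```python
import mmap
import os
import tempfile


def copy_vs_map():
    """Contrast a read()-style private copy with a shared file mapping."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"OLD! rest of page")

        # read path: the kernel copies the bytes into a buffer owned by this process
        copied = os.pread(fd, 4, 0)

        # mmap path: the file's pages are mapped into our address space
        with mmap.mmap(fd, 0, access=mmap.ACCESS_WRITE) as m:
            m[:4] = b"NEW!"   # a store into the mapping changes the file page
            m.flush()         # msync, so a later read is guaranteed to see it
            mapped = m[:4]

        reread = os.pread(fd, 4, 0)
        return copied, mapped, reread
    finally:
        os.close(fd)
        os.unlink(path)
```

The private copy made before the store stays stale, while the mapping and a fresh read both reflect the new contents.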

Before I take this idea seriously at all, I'd want to see a design that
addresses a couple of critical issues:

1. Postgres' shared buffers are *shared*, potentially across many
processes.  How will you deal with buffers for files that have been
mmap'd by only some of the processes?  (Maybe this means that the
whole concept of shared buffers goes away, and each process does its
own buffer management based on its own mmaps.  Not sure.  That would be
a pretty radical restructuring though, and would completely invalidate
our present approach to page-level locking.)

2. How do you deal with extending a file?  My system's mmap man page
says
 If the size of the mapped file changes after the call to mmap(), the
 effect of references to portions of the mapped region that correspond
 to added or removed portions of the file is unspecified.
This suggests that the only portable way to cope is to issue a separate
mmap for every disk page.  Will typical Unix systems perform well with
umpteen thousand small mmap requests?

3. How do you persuade the other backends to drop their mmaps of a table
you are deleting?
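
The one-mmap-per-page idea from point 2 can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's mmap module on a Unix-like system, takes mmap.ALLOCATIONGRANULARITY as the page size (not PostgreSQL's 8K block size), and the helper names map_page, extend_by_one_page and demo_per_page are hypothetical.

```python
import mmap
import os
import tempfile

PAGE = mmap.ALLOCATIONGRANULARITY  # offsets passed to mmap must be multiples of this


def map_page(fd, page_no):
    """Map exactly one page of the file as a shared, writable region."""
    return mmap.mmap(fd, PAGE, access=mmap.ACCESS_WRITE, offset=page_no * PAGE)


def extend_by_one_page(fd):
    """Grow the file by one zero-filled page and return the new page number."""
    old_size = os.fstat(fd).st_size
    os.ftruncate(fd, old_size + PAGE)
    return old_size // PAGE


def demo_per_page():
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, 2 * PAGE)           # start with a two-page file
        pages = {n: map_page(fd, n) for n in range(2)}
        pages[0][:5] = b"page0"

        new_no = extend_by_one_page(fd)      # existing mappings are left alone
        pages[new_no] = map_page(fd, new_no)
        pages[new_no][:5] = b"page2"

        for m in pages.values():
            m.flush()
        result = (len(pages), os.pread(fd, 5, 0), os.pread(fd, 5, new_no * PAGE))
        for m in pages.values():
            m.close()
        return result
    finally:
        os.close(fd)
        os.unlink(path)
```

Because no existing mapping ever covers a region past the old end of file, the "unspecified" case in the man page is avoided, at the price of one mapping per page.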

There are probably other gotchas, but without an understanding of how
to address these, I doubt it's worth looking further ...
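
Why point 3 matters can be shown with a small sketch, assuming Unix unlink semantics and Python's mmap module. A single process stands in here for "another backend", and the function name mapping_outlives_unlink is made up for the example.

```python
import mmap
import os
import tempfile


def mapping_outlives_unlink():
    """Show that a mapping keeps a deleted file's contents reachable."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"tuple data")
    m = mmap.mmap(fd, 0, access=mmap.ACCESS_READ)
    os.close(fd)          # the mapping does not depend on this descriptor
    os.unlink(path)       # the "table file" is deleted by name
    name_exists = os.path.exists(path)
    contents = m[:]       # still readable through the old mapping
    m.close()             # only now can the storage really be released
    return name_exists, contents
```

Until every process drops its mapping, the file's storage stays allocated and the old contents remain visible to whoever still has it mapped.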

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/users-lounge/docs/faq.html



Re: [GENERAL] raw partition

2001-08-27 Thread Mike Castle

On Tue, Aug 28, 2001 at 12:50:15AM +1000, Andrew Snow wrote:
> Yeah, fair enough.  But mmap works well on the more popular platforms
> used for PostgreSQL.  And it can't *hurt* performance, and it's probably

Actually, it CAN hurt performance, even on some of the more popular
platforms.

> worth doing simply so that PostgreSQL "plays nicely" with other
> applications using the VM resources on a particular system, instead of
> the "fixed size buffer cache" approach.

Using mmap() vs a fixed buffer doesn't really make much difference.  The
access patterns are going to be pretty similar in both cases and the level
of paging would be about the same.

mrc
-- 
 Mike Castle  [EMAIL PROTECTED]  www.netcom.com/~dalgoda/
We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc




Re: [GENERAL] raw partition

2001-08-27 Thread Martijn van Oosterhout

On Tue, Aug 28, 2001 at 12:02:08AM +1000, Andrew Snow wrote:
> 
> What I think would be better would be moving postgresql to a system of
> using memory-mapped I/O.  Instead of the shared buffer cache, files
> would be directly memory-mapped and the OS would do the caching.  I
> can't see this happening though because of platform dependency, but I
> think it's worth another look soon because many unix platforms support
> mmap().  I think it would improve the performance of disk-intensive
> tasks noticeably.

Well, this has other problems. Consider tables that are larger than your
system memory. You'd have to continuously map and unmap different sections.
That can have odd side effects (witness mozilla on linux having 15,000
mapped areas or so...)
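
The continuous map-and-unmap pattern described above might look like the sketch below. It assumes Python's mmap module, a window size equal to mmap.ALLOCATIONGRANULARITY (far smaller than anything a real system would pick), and hypothetical names scan_in_windows and demo_scan.

```python
import mmap
import os
import tempfile

WINDOW = mmap.ALLOCATIONGRANULARITY  # mmap offsets must be multiples of this


def scan_in_windows(path, window=WINDOW):
    """Sum every byte of a file while mapping only one window at a time."""
    total = 0
    mappings = 0
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        for offset in range(0, size, window):
            length = min(window, size - offset)
            # each iteration creates a mapping and unmaps it when the block exits
            with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ, offset=offset) as m:
                total += sum(m[:])
                mappings += 1
    return total, mappings


def demo_scan():
    """Build a file of two and a half windows and scan it."""
    size = 2 * WINDOW + WINDOW // 2
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x01" * size)
        os.close(fd)
        return scan_in_windows(path), size
    finally:
        os.unlink(path)
```

Every window costs a map and an unmap call, which is the bookkeeping overhead the paragraph above is worried about.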

You would still however get the advantage that you wouldn't have to copy the
data from the disk buffers to user space, you simply get the disk buffer
mapped into your address space.

I think that for commonly used tables that are under 100K in size (most of
the system tables), this is quite a workable idea, if you don't mind keeping
them mapped the whole time.

-- 
Martijn van Oosterhout <[EMAIL PROTECTED]>
http://svana.org/kleptog/
> It would be nice if someone came up with a certification system that
> actually separated those who can barely regurgitate what they crammed over
> the last few weeks from those who command secret ninja networking powers.




RE: [GENERAL] raw partition

2001-08-27 Thread Andrew Snow

> It wouldn't be a very bad idea for systems where mmap is 
> noticeably faster than read/write using syscalls. 
> Unfortunately on some of those systems mmap is broken for 
> multiple processes mapping the same file...:)

Yeah, fair enough.  But mmap works well on the more popular platforms
used for PostgreSQL.  And it can't *hurt* performance, and it's probably
worth doing simply so that PostgreSQL "plays nicely" with other
applications using the VM resources on a particular system, instead of
the "fixed size buffer cache" approach.


> But if someone wants to work on it, this would be a fairly
> modest-sized project that only affects bufmgr...

Interesting. Might be worth taking a look at.


- Andrew

