On Wed, Mar 28, 2007 at 02:47:12PM +0900, ITAGAKI Takahiro wrote:
Magnus Hagander [EMAIL PROTECTED] wrote:
IIRC, we're still waiting for performance numbers showing there exists a
win from this patch.
Here is a performance number of Direct I/O support on Windows.
There was 10%+ of performance win on pgbench (263.33 vs. 290.79) in O_DIRECT.
However, I only
On Thu, Jan 11, 2007 at 02:35:13PM -0800, [EMAIL PROTECTED] wrote:
I caught this thread about O_DIRECT on kerneltrap.org:
http://kerneltrap.org/node/7563
It sounds like there is much to be gained here in terms of reducing
the number of user/kernel space copies in the operating system. I got
the impression that posix_fadvise in the Linux kernel isn't as good
Greg Stark wrote:
Manfred Spraul [EMAIL PROTECTED] writes:
One problem for WAL is that O_DIRECT would disable the write cache -
each operation would block until the data arrived on disk, and that might block
other backends that try to access WALWriteLock.
Perhaps a dedicated backend that does the writeback
DB2 supports cooked and raw file systems - SMS (System Managed Space)
and DMS (Database Managed Space) tablespaces.
The DB2 experience is that DMS tends to outperform SMS but requires
considerable tuning and administrative overhead to see these wins.
--
Pip-pip
Sailesh
My experience with DB2 showed that properly setup DMS tablespaces provided a
significant performance benefit. I have also seen that the average DBA does
not generally understand the data or access patterns in the database. Given
that, they don't correctly set up table spaces in general,
-Original Message-
From: Jordan Henderson [mailto:[EMAIL PROTECTED]
Sent: Thursday, October 30, 2003 4:31 PM
To: [EMAIL PROTECTED]; Doug McNaught
Cc: Christopher Kings-Lynne; PostgreSQL-development
Subject: Re: [HACKERS] O_DIRECT in freebsd
My experience with DB2 showed
Jordan == Jordan Henderson [EMAIL PROTECTED] writes:
Jordan significantly better results. I would not say it requires
Jordan considerable tuning, but an understanding of data, storage
Jordan and access patterns. Additionally, these features did not
Jordan cause our group
Personally, I think it is useful to have features. I quite understand the
difficulties in maintaining some features however. Also having worked on
internals for commercial DB engines, I have seen specifically how code/data paths
can be shortened. I would not make the choice for someone to be
FreeBSD 4.9 was released today. In the release notes was:
2.2.6 File Systems
A new DIRECTIO kernel option enables support for read operations that
bypass the buffer cache and put data directly into a userland buffer.
This feature requires that the O_DIRECT flag is set on the file
descriptor
scott.marlowe [EMAIL PROTECTED] writes:
I would think the biggest savings could come from using directIO for
vacuuming, so it doesn't cause the kernel to flush buffers.
Would that be just as hard to implement?
Two words: cache coherency.
-Doug
Tom Lane wrote:
Not for WAL --- we never read the WAL at all in normal operation. (If
it works for writes, then we would want to use it for writing WAL, but
that's not apparent from what Christopher quoted.)
At least under Linux, it works for writes. Oracle uses O_DIRECT to
access (both read
What you really want is Solaris's free-behind, where it detects if a
scan is exceeding a certain percentage of the OS cache and moves the
pages to the _front_ of the to-be-reused list. I am not sure what other
OS's support this, but we need this on our own buffer manager code as
well.
Our TODO
Basically, we don't know when we read a buffer whether this is a
read-only or read/write. In fact, we could read it in, and another
backend could write it for us.
The big issue is that when we do a write, we don't wait for it to get to
disk.
It seems that to use O_DIRECT, we would have to read the
Basically, we don't know when we read a buffer whether this is a
read-only or read/write. In fact, we could read it in, and another
backend could write it for us.
Um, wait. The cache is shared between backends? I don't think so,
but it shouldn't matter because there has to be a semaphore
Bruce Momjian [EMAIL PROTECTED] writes:
Basically, I think we need free-behind rather than O_DIRECT.
There are two separate issues here --- one is what's happening in our
own cache, and one is what's happening in the kernel disk cache.
Implementing our own free-behind code would help in our own cache
but does nothing for the
Bruce Momjian [EMAIL PROTECTED] writes:
_That_ is an excellent point. However, do we know at the time we open
the file descriptor if we will be doing this?
We'd have to say on a per-read basis whether we want O_DIRECT or not,
and fd.c would need to provide a suitable file descriptor.
What
What about cache coherency problems with other backends not
opening with O_DIRECT?
If O_DIRECT introduces cache coherency problems against other
processes not using O_DIRECT then the whole idea is a nonstarter,
but I can't imagine any kernel hackers would have been stupid
enough
Bruce Momjian [EMAIL PROTECTED] writes:
True, it is a cost/benefit issue. My assumption was that once we have
free-behind in the PostgreSQL shared buffer cache, the kernel cache
issues would be minimal, but I am willing to be found wrong.
If you are running on the
small-shared-buffers-and-large-kernel-cache theory, then
Sean Chittenden [EMAIL PROTECTED] writes:
it doesn't seem totally out of the question. I'd kinda like to see
some experimental evidence that it's worth doing though. Anyone
care to make a quick-hack prototype and do some measurements?
What would you like to measure? Overall system
Sean Chittenden wrote:
Nor could it ever be a win unless the cache was populated via
O_DIRECT, actually. Big PG cache == 2 extra copies of data, once in
the kernel and once in PG. Doing caching at the kernel level, however
means only one copy of data (for the most part). Only problem with
this being that
Jim C. Nasby [EMAIL PROTECTED] writes:
DB2 and Oracle, from memory, allow users to pass hints to the planner to
use/not use file system caching.
Might it make sense to do this for on-disk sorts, since sort_mem is
essentially being used as a disk cache (at least for reads)?
If sort_mem were
Also, keep in mind writes to O_DIRECT devices have to wait for the data
to get on the platters rather than into the kernel cache.
I noticed this in the FreeBSD 5.1 release notes:
A new DIRECTIO kernel option enables support for read operations that
bypass the buffer cache and put data directly into a userland buffer. This
feature requires that the O_DIRECT flag is set on the file descriptor and
that both the offset and
Will PostgreSQL pick this up automatically, or do we need to add
extra checks?
Extra checks, though I'm not sure why you'd want this. This is the
equiv of a nice way of handling raw IO for read only
operations... which would be bad. Call me crazy, but unless you're on
The reason I
On Tue, 17 Jun 2003, Christopher Kings-Lynne wrote:
A new DIRECTIO kernel option enables support for read operations that
bypass the buffer cache and put data directly into a userland buffer
Will PostgreSQL pick this up automatically, or do we need to add extra
checks?
You don't want it
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
The reason I mention it is that Postgres already supports O_DIRECT I think
on some other platforms (for whatever reason).
[ sounds of grepping... ] No. The only occurrence of O_DIRECT in the
source tree is in TODO:
* Consider use of open/fcntl(O_DIRECT) to minimize OS caching
On Wed, Jun 18, 2003 at 10:01:37AM +1000, Gavin Sherry wrote:
On Tue, 17 Jun 2003, Tom Lane wrote:
* Consider use of open/fcntl(O_DIRECT) to minimize OS caching
I personally disagree with this TODO item for the same reason Sean
cited: Postgres is designed and tuned to rely on OS-level
The O_DIRECT flag has been added to open(2) and fcntl(2). Specifying this
flag for open files will attempt to minimize the cache effects of reading
and writing.
I wonder if using this for WAL would be good.
Not before the code is optimized to write more than the current 8k to the
The O_DIRECT flag has been added in FreeBSD 4.4 (386 Alpha) also. From
the release notes:
Kernel Changes
The O_DIRECT flag has been added to open(2) and fcntl(2). Specifying this
flag for open files will attempt to minimize the cache effects of reading
and writing.
I wonder if using
Well, O_DIRECT has finally made it into the Linux kernel. It lets you
open a file in such a way that reads and writes don't go to the buffer
cache but straight to the disk. Accesses must be aligned on
filesystem block boundaries.
Is there any case where PG would benefit from this? I can see