Markus Wanner wrote:
Hi,
Martijn van Oosterhout wrote:
And fsync better do what you're asking
(how fast is just a performance issue, just as long as it's done).
Where are we on this issue? I've read all of this thread and the one on
the lvm-linux mailing list as well, but still don't
Hi,
Martijn van Oosterhout wrote:
And fsync better do what you're asking
(how fast is just a performance issue, just as long as it's done).
Where are we on this issue? I've read all of this thread and the one on
the lvm-linux mailing list as well, but still don't feel confident.
In the
On Thu, Mar 19, 2009 at 12:49:52AM +0100, Marco Colombo wrote:
It has to wait for I/O completion on write(); then it has to go to
sleep. If two different processes do a write(), you don't know which
will be awakened first. Preallocation doesn't mean much here, since with
O_SYNC you expect a
Martijn van Oosterhout wrote:
True, but the relative wakeup order of two different processes is not
important since by definition they are working on different
transactions. As long as the WAL writes for a single transaction (in a
single process) are not reordered you're fine.
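The single-process guarantee quoted above can be sketched in Python. This is a hedged sketch, assuming a POSIX system whose whole storage stack honors O_SYNC; "wal_sync.log" is a hypothetical file name:

```python
import os

# With O_SYNC each write() returns only after the data is on stable
# storage (assuming no layer below is lying), so record A is durable
# before record B is even issued -- the in-process ordering is free.
fd = os.open("wal_sync.log", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
try:
    os.write(fd, b"record A\n")
    os.write(fd, b"record B\n")
finally:
    os.close(fd)
```

Nothing here constrains the interleaving of writes from a second process to the same file; that only matters if two unsynchronized writers share a file, which is exactly the case the quote rules out.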
I'm not totally
Ron Mayer wrote:
Marco Colombo wrote:
Yes, but we knew it already, didn't we? It's always been like
that, with IDE disks and write-back cache enabled, fsync just
waits for the disk reporting completion and disks lie about
I've looked hard, and I have yet to see a disk that lies.
No, lie in
Hello,
As a continued follow up to this thread, Tim Post replied on the LVM
list to this effect:
If a logical volume spans physical devices where write caching is
enabled, the results of fsync() can not be trusted. This is an issue
with device mapper, lvm is one of a few possible customers of
Marco Colombo wrote:
Yes, but we knew it already, didn't we? It's always been like
that, with IDE disks and write-back cache enabled, fsync just
waits for the disk reporting completion and disks lie about
I've looked hard, and I have yet to see a disk that lies.
ext3, OTOH seems to lie.
IDE
I am jumping into this thread late, and maybe this has already been
stated clearly, but from my experience benchmarking, LVM does *not*
lie about fsync() on the servers I've configured. An fsync() goes to
the physical device. You can see it clearly by setting the write
cache on the RAID
Marco Colombo wrote:
Ron Mayer wrote:
Greg Smith wrote:
There are some known limitations to Linux fsync that I remain somewhat
concerned about, independently of LVM, like ext3 fsync() only does a
journal commit when the inode has changed (see
Greg Smith wrote:
On Wed, 18 Mar 2009, Marco Colombo wrote:
If you fsync() after each write you want ordered, there can't be any
subsequent I/O (unless there are many different processes
concurrently writing to the file w/o synchronization).
Inside PostgreSQL, each of the database backend
On Wed, Mar 18, 2009 at 10:58:39PM +0100, Marco Colombo wrote:
I hope it's a full defence. If you have two processes doing at the
same time write(); fsync(); on the same file, either there are no order
requirements, or it will boom sooner or later... fsync() works inside
a single process, but
Ron Mayer wrote:
Marco Colombo wrote:
Ron Mayer wrote:
Greg Smith wrote:
There are some known limitations to Linux fsync that I remain somewhat
concerned about, independently of LVM, like ext3 fsync() only does a
journal commit when the inode has changed (see
On Wed, 18 Mar 2009, Martijn van Oosterhout wrote:
Generally PG uses O_SYNC on open
Only if you change wal_sync_method=open_sync. That's the very last option
PostgreSQL will try--only if none of the others are available will it use
that.
Last time I checked the default value for that
Martijn van Oosterhout wrote:
Generally PG uses O_SYNC on open, so it's only one system call, not
two. And the file it's writing to is generally preallocated (not
always though).
It has to wait for I/O completion on write(); then it has to go to
sleep. If two different processes do a write(),
John R Pierce wrote:
Stefan Kaltenbrunner wrote:
So in my understanding LVM is safe on disks that have write cache
disabled or behave as one (like a controller with a battery backed
cache).
what about drive write caches on battery backed raid controllers? do
the controllers ensure the
On Tue, 17 Mar 2009, Marco Colombo wrote:
If LVM/dm is lying about fsync(), all this is moot. There's no point
talking about disk caches.
I decided to run some tests to see what's going on there, and it looks
like some of my quick criticism of LVM might not actually be valid--it's
only the
Greg Smith wrote:
There are some known limitations to Linux fsync that I remain somewhat
concerned about, independently of LVM, like ext3 fsync() only does a
journal commit when the inode has changed (see
http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 ). The
way files are
On Tue, 17 Mar 2009, Ron Mayer wrote:
I wonder if there should be an optional fsync mode
in postgres that turns fsync() into
fchmod (fd, 0644); fchmod (fd, 0664);
to work around this issue.
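The workaround quoted above relies on ext3 committing its journal whenever the inode is dirty, so toggling the mode forces a real commit even when only data blocks changed. A minimal sketch of the idea (hypothetical file name; the 0644/0664 mode pair mirrors the values quoted in the thread):

```python
import os

def paranoid_fsync(fd):
    """Dirty the inode with a mode toggle before fsync(), so an ext3
    fsync() is forced into a journal commit even for data-only writes."""
    os.fchmod(fd, 0o644)
    os.fchmod(fd, 0o664)
    os.fsync(fd)

fd = os.open("wal_chmod.log", os.O_WRONLY | os.O_CREAT, 0o664)
os.write(fd, b"some WAL data\n")
paranoid_fsync(fd)
os.close(fd)
```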
The test I haven't had time to run yet is to turn the bug exposing program
you were fiddling with
Greg Smith wrote:
On Tue, 17 Mar 2009, Marco Colombo wrote:
If LVM/dm is lying about fsync(), all this is moot. There's no point
talking about disk caches.
I decided to run some tests to see what's going on there, and it looks
like some of my quick criticism of LVM might not actually be
Ron Mayer wrote:
Greg Smith wrote:
There are some known limitations to Linux fsync that I remain somewhat
concerned about, independently of LVM, like ext3 fsync() only does a
journal commit when the inode has changed (see
http://kerneltrap.org/mailarchive/linux-kernel/2008/2/26/990504 ). The
On Wed, 18 Mar 2009, Marco Colombo wrote:
If you fsync() after each write you want ordered, there can't be any
subsequent I/O (unless there are many different processes concurrently
writing to the file w/o synchronization).
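The pattern described in that quote can be sketched as follows (single process, hypothetical file name): the fsync() between the two writes acts as an ordering barrier, because the first write must reach stable storage before the second is issued.

```python
import os

fd = os.open("ordered.log", os.O_WRONLY | os.O_CREAT, 0o600)
os.write(fd, b"must land first\n")
os.fsync(fd)  # barrier: returns only once the first record is durable
os.write(fd, b"may land second\n")
os.fsync(fd)
os.close(fd)
```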
Inside PostgreSQL, each of the database backend processes ends up
Tom Lane wrote:
Jack Orenstein jack.orenst...@hds.com writes:
The transaction rates I'm getting seem way too high: 2800-2900 with
one thread, 5000-7000 with ten threads. I'm guessing that writes
aren't really reaching the disk. Can someone suggest how to figure out
where, below postgres,
On Mon, Mar 16, 2009 at 2:03 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
So in my understanding LVM is safe on disks that have write cache disabled
or behave as one (like a controller with a battery backed cache).
For storage with write caches it seems to be unsafe, even if the
Stefan Kaltenbrunner wrote:
So in my understanding LVM is safe on disks that have write cache
disabled or behave as one (like a controller with a battery backed
cache).
what about drive write caches on battery backed raid controllers? do
the controllers ensure the drive cache gets flushed
Scott Marlowe wrote:
On Mon, Mar 16, 2009 at 2:03 PM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:
So in my understanding LVM is safe on disks that have write cache disabled
or behave as one (like a controller with a battery backed cache).
For storage with write caches it seems to be
Joshua D. Drake wrote:
I understand but disabling cache is not an option for anyone I know. So
I need to know the other :)
Joshua D. Drake
Come on, how many people/organizations do you know who really need 30+ MB/s
sustained write throughput in the disk subsystem but can't afford a
On Sat, 2009-03-14 at 05:25 +0100, Marco Colombo wrote:
Scott Marlowe wrote:
Also see:
http://lkml.org/lkml/2008/2/26/41
but it seems to me that all this discussion is under the assumption that
disks have write-back caches.
"The alternative is to disable the disk write cache" says it all.
If
Joshua D. Drake wrote:
On Sat, 2009-03-14 at 05:25 +0100, Marco Colombo wrote:
Scott Marlowe wrote:
Also see:
http://lkml.org/lkml/2008/2/26/41
but it seems to me that all this discussion is under the assumption that
disks have write-back caches.
The alternative is to disable the disk
On Sun, 2009-03-15 at 01:48 +0100, Marco Colombo wrote:
Joshua D. Drake wrote:
On Sat, 2009-03-14 at 05:25 +0100, Marco Colombo wrote:
Scott Marlowe wrote:
Also see:
http://lkml.org/lkml/2008/2/26/41
but it seems to me that all this discussion is under the assumption that
disks have
Scott Marlowe wrote:
On Fri, Mar 6, 2009 at 2:22 PM, Ben Chobot be...@silentmedia.com wrote:
On Fri, 6 Mar 2009, Greg Smith wrote:
On Fri, 6 Mar 2009, Tom Lane wrote:
Otherwise you need to reconfigure your drive to not cache writes.
I forget the incantation for that but it's in the PG
Marco Colombo pg...@esiway.net writes:
And I'm still wondering. The problem with LVM, AFAIK, is missing support
for write barriers. Once you disable the write-back cache on the disk,
you no longer need write barriers. So I'm missing something, what else
does LVM do to break fsync()?
I think
Tom Lane wrote:
Marco Colombo pg...@esiway.net writes:
And I'm still wondering. The problem with LVM, AFAIK, is missing support
for write barriers. Once you disable the write-back cache on the disk,
you no longer need write barriers. So I'm missing something, what else
does LVM do to break
Marco Colombo pg...@esiway.net writes:
You mean some layer (LVM) is lying about the fsync()?
Got it in one.
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
On Fri, 2009-03-13 at 14:00 -0400, Tom Lane wrote:
Marco Colombo pg...@esiway.net writes:
You mean some layer (LVM) is lying about the fsync()?
Got it in one.
I wouldn't think this would be a problem with the proper battery backed
raid controller correct?
Joshua D. Drake
On Fri, 13 Mar 2009, Joshua D. Drake wrote:
On Fri, 2009-03-13 at 14:00 -0400, Tom Lane wrote:
Marco Colombo pg...@esiway.net writes:
You mean some layer (LVM) is lying about the fsync()?
Got it in one.
I wouldn't think this would be a problem with the proper battery backed
raid
On Fri, 2009-03-13 at 11:17 -0700, Ben Chobot wrote:
On Fri, 13 Mar 2009, Joshua D. Drake wrote:
On Fri, 2009-03-13 at 14:00 -0400, Tom Lane wrote:
Marco Colombo pg...@esiway.net writes:
You mean some layer (LVM) is lying about the fsync()?
Got it in one.
I wouldn't think this
On Fri, 13 Mar 2009, Joshua D. Drake wrote:
It seems to me that all you get with a BBU-enabled card is the ability to
get bursts of writes out of the OS faster. So you still have the problem,
it's just less likely to be encountered.
A BBU controller is about more than that. It is also supposed
On Fri, 2009-03-13 at 11:41 -0700, Ben Chobot wrote:
On Fri, 13 Mar 2009, Joshua D. Drake wrote:
Of course. But if you can't reliably flush the OS buffers (because, say,
you're using LVM so fsync() doesn't work), then you can't say what
actually has made it to the safety of the raid card.
On Fri, 2009-03-13 at 11:41 -0700, Ben Chobot wrote:
On Fri, 13 Mar 2009, Joshua D. Drake wrote:
It seems to me that all you get with a BBU-enabled card is the ability to
get bursts of writes out of the OS faster. So you still have the problem,
it's just less likely to be encountered.
A
On Mar 13, 2009, at 11:59 AM, Joshua D. Drake wrote:
Wait, actually a good BBU RAID controller will disable the cache on
the
drives. So everything that is cached is already on the controller vs.
the drives themselves.
Or am I missing something?
Maybe I'm missing something, but a BBU controller
On Fri, Mar 13, 2009 at 1:09 PM, Christophe x...@thebuild.com wrote:
On Mar 13, 2009, at 11:59 AM, Joshua D. Drake wrote:
Wait, actually a good BBU RAID controller will disable the cache on the
drives. So everything that is cached is already on the controller vs.
the drives themselves.
Or am I
Scott Marlowe wrote:
On Fri, Mar 13, 2009 at 1:09 PM, Christophe x...@thebuild.com wrote:
So, if the software calls fsync, but fsync doesn't actually push the data to
the controller, you are still at risk... right?
Ding!
I've been doing some googling, now I'm not sure that not supporting
I'm using postgresql 8.3.6 through JDBC, and trying to measure the maximum
transaction rate on a given Linux box. I wrote a test program that:
- Creates a table with two int columns and no indexes,
- loads the table through a configurable number of threads, with each
transaction writing one
Jack Orenstein jack.orenst...@hds.com writes:
The transaction rates I'm getting seem way too high: 2800-2900 with
one thread, 5000-7000 with ten threads. I'm guessing that writes
aren't really reaching the disk. Can someone suggest how to figure out
where, below postgres, someone is lying
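One way to sanity-check transaction rates like those quoted above, with no PostgreSQL in the picture at all, is to time raw write()+fsync() pairs. A rough sketch (hypothetical file name; run it on the filesystem under test):

```python
import os
import time

# Each iteration imitates one tiny transaction: write a little
# data, then fsync() it.
N = 50
fd = os.open("fsync_probe.dat", os.O_WRONLY | os.O_CREAT, 0o600)
start = time.perf_counter()
for _ in range(N):
    os.write(fd, b"x" * 16)
    os.fsync(fd)
elapsed = time.perf_counter() - start
os.close(fd)
rate = N / elapsed
# On a 7200 rpm disk an honest fsync() tops out near 120/s (roughly
# one platter revolution per commit); rates in the thousands suggest
# a write cache, or a lying layer, is acknowledging early.
print("%.0f fsyncs/sec" % rate)
```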
On Fri, 6 Mar 2009, Tom Lane wrote:
Otherwise you need to reconfigure your drive to not cache writes.
I forget the incantation for that but it's in the PG list archives.
There's a discussion of this in the docs now,
http://www.postgresql.org/docs/8.3/interactive/wal-reliability.html
hdparm
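The incantation in question is presumably hdparm's -W flag (-W0 disables the drive's write cache, -W1 enables it). A hedged sketch that only builds the command; the device path is an example, and actually running it needs root and a real IDE/SATA drive:

```python
import subprocess

def write_cache_cmd(device, enable):
    """Build the hdparm command line that toggles the drive's
    write cache for the given block device."""
    return ["hdparm", "-W%d" % (1 if enable else 0), device]

cmd = write_cache_cmd("/dev/sda", enable=False)
# Uncomment to actually run (requires root and a real drive):
# subprocess.run(cmd, check=True)
print(" ".join(cmd))
```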
On Fri, 6 Mar 2009, Greg Smith wrote:
On Fri, 6 Mar 2009, Tom Lane wrote:
Otherwise you need to reconfigure your drive to not cache writes.
I forget the incantation for that but it's in the PG list archives.
There's a discussion of this in the docs now,
On Fri, Mar 6, 2009 at 2:22 PM, Ben Chobot be...@silentmedia.com wrote:
On Fri, 6 Mar 2009, Greg Smith wrote:
On Fri, 6 Mar 2009, Tom Lane wrote:
Otherwise you need to reconfigure your drive to not cache writes.
I forget the incantation for that but it's in the PG list archives.
There's
On Fri, 6 Mar 2009, Ben Chobot wrote:
How does turning off write caching on the disk stop the problem with LVM?
It doesn't. Linux LVM is awful and broken; I was just suggesting more
details on what you still need to check even when it's not involved.
--
* Greg Smith gsm...@gregsmith.com