Bruce Momjian writes:
> Tom Lane wrote:
>> ISTM Windows' idea of fsync is quite different from Unix's and therefore
>> we should name the wal_sync_method that invokes it something different
>> than fsync. "write_through" or some such?
> Ah, I remember now. On Win32 our fsync is:
> #define
Tom Lane wrote:
> Bruce Momjian writes:
> > Tom Lane wrote:
> >> ISTM Windows' idea of fsync is quite different from Unix's and therefore
> >> we should name the wal_sync_method that invokes it something different
> >> than fsync. "write_through" or some such?
>
> > Ah, I remember now. On Win32
, 2005 10:53 AM
To: Tom Lane
Cc: Magnus Hagander; Michael Paesold; pgsql-hackers@postgresql.org;
[EMAIL PROTECTED]; Merlin Moncure
Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync
question
Tom Lane wrote:
> Bruce Momjian writes:
> > However, I do prefer this patch and
Bruce Momjian writes:
> Tom Lane wrote:
>> we should name the wal_sync_method that invokes it something different
>> than fsync. "write_through" or some such? We already have precedent
>> that not all wal_sync_method values are available on all platforms.
> Yes, I am thinking that too. I hesis
Tom Lane wrote:
> Bruce Momjian writes:
> > However, I do prefer this patch and let Win32 have the same write cache
> > issues as Unix, for consistency.
>
> I agree that the open flag is more nearly O_DSYNC than O_SYNC.
>
> ISTM Windows' idea of fsync is quite different from Unix's and therefore
Bruce Momjian writes:
> However, I do prefer this patch and let Win32 have the same write cache
> issues as Unix, for consistency.
I agree that the open flag is more nearly O_DSYNC than O_SYNC.
ISTM Windows' idea of fsync is quite different from Unix's and therefore
we should name the wal_sync_m
Magnus Hagander wrote:
> > This indicated to me that open_sync did not require any
> > additional changes than our current fsync.
>
> fsync and open_sync both write through the write cache in the operating
> system. Only fsync=off turns this off.
>
> fsync also writes through the hardware write
> > > > * Win32, with fsync, write-cache disabled: no data corruption
> > > > * Win32, with fsync, write-cache enabled: no data corruption
> > > > * Win32, with osync, write cache disabled: no data corruption
> > > > * Win32, with osync, write cache enabled: no data
> corruption. Once
> > > > I
>
Bruce Momjian wrote:
Michael Paesold wrote:
Magnus Hagander wrote:
[snip]
Michael, I am not sure why you come to the conclusion that open_sync
requires turning off the disk write cache. I saw nothing to indicate
that in the thread:
I was just seeing his error message below...
http://archives.postg
Michael Paesold wrote:
> Magnus Hagander wrote:
>
>
> >> Magnus Hagander wrote:
> >>> > Magnus prepared a trivial patch which added the O_SYNC flag
> >>> > for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
> >>> > win32_open.c.
> [snip]
>
> > Michael Paesold wrote:
> >>The original patch d
-Original Message-
From: [EMAIL PROTECTED] on behalf of Bruce Momjian
Sent: Sun 2/27/2005 12:54 AM
To: Magnus Hagander
Cc: Tom Lane; pgsql-hackers@postgresql.org; [EMAIL PROTECTED]; Merlin Moncure
Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance - fsync question
> Pa
e; pgsql-hackers@postgresql.org;
>[EMAIL PROTECTED]; Merlin Moncure
>Subject: Re: [pgsql-hackers-win32] [HACKERS] win32 performance
>- fsync question
>
>
>
>Patch applied. Thanks.
>
>I assume this is not appropriate for 8.0.X.
>
>-
Magnus Hagander wrote:
Magnus Hagander wrote:
> Magnus prepared a trivial patch which added the O_SYNC flag
> for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
> win32_open.c.
[snip]
Michael Paesold wrote:
The original patch did not have any documentation. Have you
added some? Since this has
Bruce Momjian wrote:
Patch applied. Thanks.
I assume this is not appropriate for 8.0.X.
---
Magnus Hagander wrote:
> Magnus prepared a trivial patch which added the O_SYNC flag
> for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
> wi
>> Patch applied. Thanks.
>>
>> I assume this is not appropriate for 8.0.X.
>>
>> ---
>>
>>
>> Magnus Hagander wrote:
>>> > Magnus prepared a trivial patch which added the O_SYNC flag
>>> > for windows and mapped it to FILE_FLAG_WRITE_THRO
Patch applied. Thanks.
I assume this is not appropriate for 8.0.X.
---
Magnus Hagander wrote:
> > Magnus prepared a trivial patch which added the O_SYNC flag
> > for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
>
> >Are you verifying that all the data that was committed was actually stored?
> >Or
> >just verifying that the database works properly after rebooting?
>
> I verified the data.
Does pg startup increase the xid by some amount (say 1000 xids) after crash ?
Else I think you would also need to rol
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> > I'm a bit surprised that the write-cache led to a corrupt database, and
> > not merely lost transactions. I had the impression that drives still
> > handled the writes in the order received.
>
> In this case, it was lost transactions, not data c
>> * Linux, with fsync (default), write-cache enabled: usually no data
>> corruption, but two runs which had
>
>Are you verifying that all the data that was committed was
>actually stored? Or
>just verifying that the database works properly after rebooting?
I verified the data.
>I'm a bit surpr
>> You may find that if you check this case again that the "usually no data
>> corruption" is actually "usually lost transactions but no corruption".
>
>That's a good point, but it seems difficult to be sure of the last
>reportedly-committed transaction in a powerfail situation. Maybe if
>you
Tom Lane <[EMAIL PROTECTED]> writes:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > I'm a bit surprised that the write-cache led to a corrupt database, and not
> > merely lost transactions. I had the impression that drives still handled the
> > writes in the order received.
>
> There'd be little
Greg Stark <[EMAIL PROTECTED]> writes:
> I'm a bit surprised that the write-cache led to a corrupt database, and not
> merely lost transactions. I had the impression that drives still handled the
> writes in the order received.
There'd be little point in having a cache if they did, I should think
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> * Linux, with fsync (default), write-cache enabled: usually no data
> corruption, but two runs which had
Are you verifying that all the data that was committed was actually stored? Or
just verifying that the database works properly after rebooting?
> > * Win32, with fsync, write-cache disabled: no data corruption
> > * Win32, with fsync, write-cache enabled: no data corruption
> > * Win32, with osync, write cache disabled: no data corruption
> > * Win32, with osync, write cache enabled: no data corruption. Once I
> > got:
> > 2005-02-24 12:19
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> My results are:
> First, baseline:
> * Linux, with fsync (default), write-cache disabled: no data corruption
> * Linux, with fsync (default), write-cache enabled: usually no data
> corruption, but two runs which had
That makes sense.
> * Win32, with
My results are:
First, baseline:
* Linux, with fsync (default), write-cache disabled: no data corruption
* Linux, with fsync (default), write-cache enabled: usually no data
corruption, but two runs which had
* Win32, with fsync, write-cache disabled: no data corruption
* Win32, with fsync, write-ca
In the final test, the BIOS decided the disk was giving up and
reassigned it as 0Mb.. Required two extra cold boots, then it was back
up to 20Gb. Still no data loss.
I think it would be fun to re-run these tests with MySQL...
Chris
---(end of broadcast)--
> > Magnus prepared a trivial patch which added the O_SYNC flag for
> > windows and mapped it to FILE_FLAG_WRITE_THROUGH in win32_open.c.
>
> Attached is this trivial patch. As Merlin says, it needs some
> more reliability testing. But the numbers are at least reasonable - it
> *seems* like it's
> > On win32 (which started this discussion), fsync will sync the directory
> > entry as well, which will lead to *at least* two seeks on the disk.
> > Writing two blocks after each other to an O_SYNC opened file should give
> > exactly two seeks.
>
> I think you are making the following not mainta
> >> One point that I no longer recall the reasoning behind is that xlog.c
> >> doesn't think O_SYNC is a preferable default over fsync.
> >
> >For larger (>8k) transactions O_SYNC|O_DIRECT is only good with the recent
> >pending patch to group WAL writes together. The fsync method gives the OS
>> One point that I no longer recall the reasoning behind is that xlog.c
>> doesn't think O_SYNC is a preferable default over fsync.
>
>For larger (>8k) transactions O_SYNC|O_DIRECT is only good with the recent
>pending patch to group WAL writes together. The fsync method gives the OS a
>cha
>> Portability, or rather the complete lack of it. Stuff that isn't in the
>> Single Unix Spec is a hard sell.
>
>O_DIRECT is reasonably common among modern Unixen (it is supported by
>Linux, FreeBSD, and probably a couple of the commercial variants like
>AIX or IRIX); it should also be reason
>> Magnus prepared a trivial patch which added the O_SYNC flag
>> for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
>> win32_open.c.
>
>Attached is this trivial patch. As Merlin says, it needs some more
>reliability testing. But the numbers are at least reasonable - it
>*seems* like it's d
> Magnus prepared a trivial patch which added the O_SYNC flag
> for windows and mapped it to FILE_FLAG_WRITE_THROUGH in
> win32_open.c.
Attached is this trivial patch. As Merlin says, it needs some more
reliability testing. But the numbers are at least reasonable - it
*seems* like it's doing th
Magnus prepared a trivial patch which added the O_SYNC flag for windows
and mapped it to FILE_FLAG_WRITE_THROUGH in win32_open.c. We pg_benched
it and here are the results of our test on my WinXP workstation on a 10k
raptor:
Settings were pgbench -t 100 -c 10.
fsync = off:
~ 280 tps
fsync on,
> One point that I no longer recall the reasoning behind is that xlog.c
> doesn't think O_SYNC is a preferable default over fsync.
For larger (>8k) transactions O_SYNC|O_DIRECT is only good with the recent
pending patch to group WAL writes together. The fsync method gives the OS a
chance to do
Tom Lane wrote:
Portability, or rather the complete lack of it. Stuff that isn't in the
Single Unix Spec is a hard sell.
O_DIRECT is reasonably common among modern Unixen (it is supported by
Linux, FreeBSD, and probably a couple of the commercial variants like
AIX or IRIX); it should also be rea
Evgeny Rodichev <[EMAIL PROTECTED]> writes:
> No, it does. Let's try the simplest test:
>
> for (i = 0; i < LEN; i++) {
> write (fd, buf, 512);
> if (sync) fsync (fd);
> }
>
> with sync = 0 and 1, and you'll see the difference.
Uh, I'm sure you'll see a difference, one will be limited
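The loop quoted above can be tried directly. What follows is a minimal sketch in Python rather than the poster's C; the file name, the block count of 200, and the helper name `time_writes` are illustrative assumptions, not anything from the thread. It only measures how long the writes take with and without a sync after each one. It cannot tell whether the drive itself honoured the flush.

```python
import os
import tempfile
import time

BLOCK = 512  # bytes per write, as in the quoted loop


def time_writes(path, count, sync):
    """Write `count` blocks of BLOCK bytes; fsync after each one if `sync`."""
    buf = b"\0" * BLOCK
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, buf)
            if sync:
                os.fsync(fd)  # ask the OS to push this file's data to the device
        return time.perf_counter() - start
    finally:
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical scratch file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "fsync_test.dat")
        for sync in (False, True):
            elapsed = time_writes(path, 200, sync)
            print(f"sync={sync}: {200 / elapsed:.0f} writes/s")
```

A large gap between the two rates shows that fsync is doing some work; it does not by itself show that the data reached the platter.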
""Magnus Hagander"" <[EMAIL PROTECTED]>
news:[EMAIL PROTECTED]
>
> This is what we have discovered. AFAIK, all other major databases or
> other similar apps (like exchange or AD) all open files with
> FILE_FLAG_WRITE_THROUGH and do *not* use fsync. It might give noticeably
> better performance with
On Fri, 17 Feb 2005, Greg Stark wrote:
Oliver Jowett <[EMAIL PROTECTED]> writes:
So Linux is indeed doing a cache flush on fsync
Actually I think the root of the problem was precisely that Linux does not
issue any sort of cache flush commands to drives on fsync.
No, it does. Let's try the simplest
On Fri, 18 Feb 2005, Oliver Jowett wrote:
Evgeny Rodichev wrote:
Write cache is enabled under Linux by default all the time I make deal
with it (since 1993).
It doesn't interfere with fsync(), as linux kernel uses cache flush for
fsync.
The problem is that most IDE drives lie (or perhaps you could
On Thu, 17 Feb 2005, Tom Lane wrote:
Evgeny Rodichev <[EMAIL PROTECTED]> writes:
Any claimed TPS rate exceeding your disk drive's rotation rate is a
red flag.
Write cache is enabled under Linux by default all the time I make deal
with it (since 1993).
You're playing with fire.
Yes. I'm lucky in th
Greg Stark wrote:
Oliver Jowett <[EMAIL PROTECTED]> writes:
So Linux is indeed doing a cache flush on fsync
Actually I think the root of the problem was precisely that Linux does not
issue any sort of cache flush commands to drives on fsync. There was some talk
on linux-kernel of what how they co
Oliver Jowett <[EMAIL PROTECTED]> writes:
> So Linux is indeed doing a cache flush on fsync
Actually I think the root of the problem was precisely that Linux does not
issue any sort of cache flush commands to drives on fsync. There was some talk
on linux-kernel of what how they could take advant
> "Magnus Hagander" <[EMAIL PROTECTED]> writes:
> > Is there actually a reason why we don't use O_DIRECT on Unix?
>
> Portability, or rather the complete lack of it. Stuff that isn't in the
> Single Unix Spec is a hard sell.
Well, how about this (ok, maybe I'm way out in left field):
Change fsyn
Evgeny Rodichev wrote:
Write cache is enabled under Linux by default all the time I make deal
with it (since 1993).
It doesn't interfere with fsync(), as linux kernel uses cache flush for
fsync.
The problem is that most IDE drives lie (or perhaps you could say the
specification is ambiguous) about
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> Is there actually a reason why we don't use O_DIRECT on Unix?
Portability, or rather the complete lack of it. Stuff that isn't in the
Single Unix Spec is a hard sell.
regards, tom lane
>After multiple runs on different blocksizes( a few anomalous results
>aside), I didn't see a whole lot of difference between
>FILE_FLAG_NO_BUFFERING being on or off for writing performance.
>However, with NO_BUFFERING set, the file is not *read* cached at all.
>While the performance is on not terr
Evgeny Rodichev <[EMAIL PROTECTED]> writes:
>> Any claimed TPS rate exceeding your disk drive's rotation rate is a
>> red flag.
> Write cache is enabled under Linux by default all the time I make deal
> with it (since 1993).
You're playing with fire.
> fsync() really works fine as I switch off m
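The rotation-rate rule of thumb quoted above can be worked as arithmetic. This is a rough sketch under stated assumptions: a single client, one synchronous WAL write per commit, each such write waiting for the platter to come around once, no write cache, and no grouping of commits. The function name and its parameters are illustrative only.

```python
def max_commits_per_second(rpm, syncs_per_commit=1):
    """Upper bound on serialized commits/s if every sync waits one rotation."""
    rotations_per_second = rpm / 60.0
    return rotations_per_second / syncs_per_commit


if __name__ == "__main__":
    for rpm in (5400, 7200, 10000):
        bound = max_commits_per_second(rpm)
        print(f"{rpm} rpm: at most ~{bound:.0f} commits/s")
```

Under these assumptions, a single-client rate of several hundred tps with fsync on is above the bound for all three spindle speeds, which is why such a figure suggests that something is caching the writes.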
> "Magnus Hagander" <[EMAIL PROTECTED]> writes:
> > Tom, if you look at all the requirements of FILE_FLAG_NO_BUFFERING on
> > http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/
> > base/createfile.asp, can you say offhand if the WAL code fulfills them?
>
> If I'm reading it r
On Thu, 17 Feb 2005, Tom Lane wrote:
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
WinXP fsync = true 20-28 tps
WinXP fsync = false 600 tps
Linux fsync = true 800 tps
Linux fsync = false 980 tps
Wow, that's terrible on Windows. If there's a solution, it'd be nice to
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> Tom, if you look at all the requirements of FILE_FLAG_NO_BUFFERING on
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/
> base/createfile.asp, can you say offhand if the WAL code fulfills them?
If I'm reading it right, you ar
> One point that I no longer recall the reasoning behind is that xlog.c
> doesn't think O_SYNC is a preferable default over fsync. We'd certainly
> want to hack xlog.c to change its mind about that, at least on Windows;
> assuming that the FILE_FLAG way is indeed faster.
I also confirmed that the
>> > WinXP fsync = true 20-28 tps
>> > WinXP fsync = false 600 tps
>> > Linux fsync = true 800 tps
>> > Linux fsync = false 980 tps
>>
>> Wow, that's terrible on Windows. If there's a solution, it'd be nice to
>> backport it...
>>
>
>there is. I just rigged up a test be
Evgeny Rodichev wrote:
There are two different concerns here.
1. transactions loss because of unexpected power loss and/or system failure
2. inconsistent database state
For many applications (1) is fairly acceptable, and (2) is not.
So I'd like to formulate my questions by another way.
- if PostgeSQ
Christopher Kings-Lynne <[EMAIL PROTECTED]> writes:
>> WinXP fsync = true 20-28 tps
>> WinXP fsync = false 600 tps
>> Linux fsync = true 800 tps
>> Linux fsync = false 980 tps
> Wow, that's terrible on Windows. If there's a solution, it'd be nice to
> backport it...
Actu
> > WinXP fsync = true 20-28 tps
> > WinXP fsync = false 600 tps
> > Linux fsync = true 800 tps
> > Linux fsync = false 980 tps
>
> Wow, that's terrible on Windows. If there's a solution, it'd be nice to
> backport it...
>
there is. I just rigged up a test benchmark com
There are two different concerns here.
1. transactions loss because of unexpected power loss and/or system failure
2. inconsistent database state
For many applications (1) is fairly acceptable, and (2) is not.
So I'd like to formulate my questions by another way.
- if PostgreSQL is running without fs
Some addition:
WinXP fsync = true 20-28 tps
WinXP fsync = false 600 tps
Linux fsync = true 800 tps
Linux fsync = false 980 tps
Wow, that's terrible on Windows. If there's a solution, it'd be nice to
backport it...
Chris
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> Oh, and finally. The win32 commands have the following options:
> FILE_FLAG_NO_BUFFERING. This disables the cache completely. It also has
> lots of limits, like every read and write has to be on a sector boundary
> etc. It gives great performance with
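The sector-boundary restriction mentioned for FILE_FLAG_NO_BUFFERING can be illustrated with a small check. This is a sketch only: the 512-byte sector size is an assumption (real code would have to ask the volume), the helper names are hypothetical, and nothing here calls the Windows API. It covers only the offset and length rule named in the message, not any further requirements behind its "etc.".

```python
def is_sector_aligned(offset, length, sector_size=512):
    """True if both the file offset and the transfer length sit on sector boundaries."""
    return offset % sector_size == 0 and length % sector_size == 0


def round_up_to_sector(length, sector_size=512):
    """Smallest multiple of sector_size that is >= length."""
    return -(-length // sector_size) * sector_size


if __name__ == "__main__":
    # An 8 kB page written at an 8 kB offset satisfies the rule.
    print(is_sector_aligned(8192, 8192))
    # A 100-byte record does not; it would have to be padded out to one sector.
    print(is_sector_aligned(0, 100), round_up_to_sector(100))
```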
> >Doesn't Windows support O_SYNC (or even better O_DSYNC) flag to open()?
> >That should be the Posixy spelling of FILE_FLAG_WRITE_THROUGH, if the
> >latter means what I suppose it does.
>
> They should, but someone said it didn't work. I haven't followed up on
> it, though, so it is quite possib
>>Doesn't Windows support O_SYNC (or even better O_DSYNC) flag to open()?
>>That should be the Posixy spelling of FILE_FLAG_WRITE_THROUGH, if the
>>latter means what I suppose it does.
>
>They should, but someone said it didn't work. I haven't
>followed up on it, though, so it is quite possible
On Thu, 17 Feb 2005, Andrew Dunstan wrote:
(the results are interesting, though - with fsync off Windows and Linux are
in the same performance ballpark.)
Some addition:
WinXP fsync = true 20-28 tps
WinXP fsync = false 600 tps
Linux fsync = true 800 tps
Linux fsync = false 9
>> Things worth experimenting with (these are all untested, so please
>> report any successes):
>> 1) Try reformatting with a cluster size of 8Kb (the pg page size), if
>> you can.
>> 2) Disable the last access time (like noatime on linux). "fsutil
>> behavior set disablelastaccess 1"
>> 3) Disable
"Magnus Hagander" <[EMAIL PROTECTED]> writes:
> This is what we have discovered. AFAIK, all other major databases or
> other similar apps (like exchange or AD) all open files with
> FILE_FLAG_WRITE_THROUGH and do *not* use fsync. It might give noticeably
> better performance with an O_DIRECT style W
>> This is what we have discovered. AFAIK, all other major databases or
>> other similar apps (like exchange or AD) all open files with
>> FILE_FLAG_WRITE_THROUGH and do *not* use fsync. It might give noticeably
>> better performance with an O_DIRECT style WAL logging at least. But I'm
>> unsure
>>So by all means turn off fsync if you want the performance gain *and*
>>you accept the risk. But if you do, don't come crying later that your
>>data has been lost or corrupted.
>
>>(the results are interesting, though - with fsync off Windows and Linux
>>are in the same performance ballpark.
In <[EMAIL PROTECTED]>, on 02/17/05
at 10:21 AM, Andrew Dunstan <[EMAIL PROTECTED]> said:
>E.Rodichev wrote:
>>
>> This problem is addressed by file system (fsck, journalling etc.).
>> Is it reasonable to handle it directly within application?
>>
>>
>In the words of the Duke of Wellington,
E.Rodichev wrote:
This problem is addressed by file system (fsck, journalling etc.).
Is it reasonable to handle it directly within application?
In the words of the Duke of Wellington, "If you believe that you'll
believe anything."
Please review past discussions on the mailing lists on this poin
"E.Rodichev" <[EMAIL PROTECTED]> writes:
> On Thu, 17 Feb 2005, Christopher Kings-Lynne wrote:
>
>> Fsync is so that when your computer loses power without warning, you
>> will have no data loss.
>>
>> If you turn it off, you run the risk of losing data if you lose power.
>>
>> Chris
>
> This prob
On Thu, 17 Feb 2005 17:54:38 +0300 (MSK)
"E.Rodichev" <[EMAIL PROTECTED]> wrote:
> On Thu, 17 Feb 2005, Christopher Kings-Lynne wrote:
>
> >> The general question is - does PostgreSQL really need fsync? I
> >suppose it> is a question for design, not platform-specific one. It
> >sounds like only> o
On Thu, 17 Feb 2005, Christopher Kings-Lynne wrote:
The general question is - does PostgreSQL really need fsync? I suppose it
is a question for design, not platform-specific one. It sounds like only
one scenario, when fsync is useful, is to interprocess communication via
open file. But PostgreSQL u
On Thu, 17 Feb 2005, Magnus Hagander wrote:
Hi,
looking for the way how to increase performance at Windows XP
box, I found the parameters
#fsync = true # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
The general question is - does PostgreSQL really need fsync? I suppose it
is a question for design, not platform-specific one. It sounds like only
one scenario, when fsync is useful, is to interprocess communication via
open file. But PostgreSQL utilize IPC for this, so does fsync is really
require
> Things worth experimenting with (these are all untested, so please
> report any successes):
> 1) Try reformatting with a cluster size of 8Kb (the pg page size), if
> you can.
What about recompiling pg with a 4k block size. Win32 file cluster
sizes and memory allocation units are both on 4k boun
> Hi,
>
> looking for the way how to increase performance at Windows XP
> box, I found the parameters
>
> #fsync = true # turns forced synchronization on or off
> #wal_sync_method = fsync# the default varies across platforms:
> # fsyn
> looking for the way how to increase performance at Windows XP box, I found
> the parameters
>
> #fsync = true # turns forced synchronization on or off
> #wal_sync_method = fsync# the default varies across platforms:
> # fsync, fdatasync,
Hi,
looking for the way how to increase performance at Windows XP box, I found
the parameters
#fsync = true # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
# fsync, fdatasync, open_sync