Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-14 Thread Jeffrey W. Baker
On Fri, 2005-06-24 at 09:37 -0400, Tom Lane wrote: ITAGAKI Takahiro [EMAIL PROTECTED] writes: ... So I'll post the new results: checkpoint_segments | writeback cache | open_sync | fsync=false | O_DIRECT only | fsync_direct | open_direct

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-14 Thread Jeffrey W. Baker
On Fri, 2005-06-24 at 10:19 -0500, Jim C. Nasby wrote: On Fri, Jun 24, 2005 at 09:37:23AM -0400, Tom Lane wrote: ITAGAKI Takahiro [EMAIL PROTECTED] writes: ... So I'll post the new results: checkpoint_segments | writeback cache | open_sync | fsync=false | O_DIRECT

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-14 Thread Greg Stark
Jeffrey W. Baker [EMAIL PROTECTED] writes: The batteries on a caching RAID controller can run for days at a stretch. It's not as dangerous as people make it sound. And anyone running PG on software RAID is crazy. Get back to us after your first hardware failure when your vendor says the

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-14 Thread Joshua D. Drake
Greg Stark wrote: Jeffrey W. Baker [EMAIL PROTECTED] writes: The batteries on a caching RAID controller can run for days at a stretch. It's not as dangerous as people make it sound. And anyone running PG on software RAID is crazy. Get back to us after your first hardware failure when

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-06 Thread Mark Wong
On Fri, 24 Jun 2005 09:21:56 -0700 Josh Berkus josh@agliodbs.com wrote: Jim, Josh, is this something that could be done in the performance lab? That's the idea. Sadly, OSDL's hardware has been having critical failures of late (I'm still trying to get test results on the

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-07-02 Thread Bruce Momjian
These patches will require some refactoring and documentation, but I will do that when I apply it. Your patch has been added to the PostgreSQL unapplied patches list at: http://momjian.postgresql.org/cgi-bin/pgpatches It will be applied as soon as one of the PostgreSQL committers

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-28 Thread ITAGAKI Takahiro
Tom Lane [EMAIL PROTECTED] wrote: Yeah, this is about what I was afraid of: if you're actually fsyncing then you get at best one commit per disk revolution, and the negotiation with the OS is down in the noise. If we disable writeback-cache and use open_sync, the per-page writing behavior in
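As a rough illustration of the "one commit per disk revolution" bound quoted above, here is a minimal Python sketch. The 7200 rpm figure and the function names are assumptions made for illustration and do not come from the thread. The bound only applies to a single backend committing serially, with no group commit and no write cache in the path.

```python
def max_commits_per_second(rpm: float) -> float:
    """Upper bound on serialized commits/sec when every commit
    must wait for one full platter revolution (rpm / 60)."""
    return rpm / 60.0


def fsync_looks_cached(measured_tps: float, rpm: float) -> bool:
    """Heuristic: a single serial committer exceeding the rotational
    bound suggests the fsync is not really reaching the platter."""
    return measured_tps > max_commits_per_second(rpm)


if __name__ == "__main__":
    # Hypothetical drive speed; substitute the real value for your disk.
    print(max_commits_per_second(7200))
    print(fsync_looks_cached(250.0, 7200))
```

A measured rate well above the bound does not prove write caching on its own (concurrent clients can share one flush), but it is a reason to check the drive's cache setting.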

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-24 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes: ... So I'll post the new results: checkpoint_segments | writeback cache | open_sync | fsync=false | O_DIRECT only | fsync_direct | open_direct

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-24 Thread Jim C. Nasby
On Fri, Jun 24, 2005 at 09:37:23AM -0400, Tom Lane wrote: ITAGAKI Takahiro [EMAIL PROTECTED] writes: ... So I'll post the new results: checkpoint_segments | writeback cache | open_sync | fsync=false | O_DIRECT only | fsync_direct | open_direct

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-24 Thread Josh Berkus
Jim, Josh, is this something that could be done in the performance lab? That's the idea. Sadly, OSDL's hardware has been having critical failures of late (I'm still trying to get test results on the checkpointing thing) and the GreenPlum machines aren't up yet. I need to contact those

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread Jim C. Nasby
On Wed, Jun 22, 2005 at 03:50:04PM -0400, Tom Lane wrote: The reason I question automatic is that you really want to test each drive being used, if the system has more than one; but Postgres has no idea what the actual hardware layout is, and so no good way to know what needs to be tested.

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread Douglas McNaught
Jim C. Nasby [EMAIL PROTECTED] writes: Would testing in the WAL directory be sufficient? Or at least better than nothing? Of course we could test in the database directories as well, but you never know if stuff's been symlinked elsewhere... err, we can test for that, no? In any case, it

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread Bruce Momjian
Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Unfortunately, I cannot believe these numbers --- the near equality of fsync off and fsync on means there is something very wrong with the measurements. What I suspect is that your ATA drives are

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes: Tom Lane wrote: The reason I question automatic is that you really want to test each drive being used, if the system has more than one; but Postgres has no idea what the actual hardware layout is, and so no good way to know what needs to be tested.

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread Bruce Momjian
Tom Lane wrote: Gavin Sherry [EMAIL PROTECTED] writes: Curt Sampson [EMAIL PROTECTED] writes: But is it really a problem? I somewhere got the impression that some drives, on power failure, will be able to keep going for long enough to write out the cache and park the heads anyway. If so,

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-23 Thread ITAGAKI Takahiro
Tom Lane [EMAIL PROTECTED] wrote: Unfortunately, I cannot believe these numbers --- the near equality of fsync off and fsync on means there is something very wrong with the measurements. What I suspect is that your ATA drives are doing write caching and thus the fsyncs are not really waiting

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes: Unfortunately, I cannot believe these numbers --- the near equality of fsync off and fsync on means there is something very wrong with the measurements. What I suspect is that your ATA drives are doing write caching and thus the fsyncs are not really

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Unfortunately, I cannot believe these numbers --- the near equality of fsync off and fsync on means there is something very wrong with the measurements. What I suspect is that your ATA drives are doing write caching and

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson
On Thu, 22 Jun 2005, Greg Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Unfortunately, I cannot believe these numbers --- the near equality of fsync off and fsync on means there is something very wrong with the measurements. What I suspect is that your ATA drives are doing write caching

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Tom Lane
Curt Sampson [EMAIL PROTECTED] writes: But regardless, perhaps we can add some stuff to the various OSes' startup scripts that could help with this. For example, in NetBSD you can dkctl device setcache r for most any disk device (certainly all SCSI and ATA) to enable the read cache and disable

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson
On Wed, 22 Jun 2005, Tom Lane wrote: [ shudder ] I can see the complaints now: Merely starting up Postgres cut my overall system performance by a factor of 10! Yeah, quite the scenario. This can *not* be default behavior, and unfortunately that limits its value quite a lot. Indeed. Maybe

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Tom Lane
[ on the other point... ] Curt Sampson [EMAIL PROTECTED] writes: But is it really a problem? I somewhere got the impression that some drives, on power failure, will be able to keep going for long enough to write out the cache and park the heads anyway. If so, the drive is still guaranteeing

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Gavin Sherry
On Thu, 23 Jun 2005, Tom Lane wrote: [ on the other point... ] Curt Sampson [EMAIL PROTECTED] writes: But is it really a problem? I somewhere got the impression that some drives, on power failure, will be able to keep going for long enough to write out the cache and park the heads

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Gregory Maxwell
On 6/23/05, Gavin Sherry [EMAIL PROTECTED] wrote: inertia) but seeking to a lot of new tracks to write randomly-positioned dirty sectors would require significant energy that just ain't there once the power drops. I seem to recall reading that the seek actuators eat the largest share of

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes: Curt Sampson [EMAIL PROTECTED] writes: But is it really a problem? I somewhere got the impression that some drives, on power failure, will be able to keep going for long enough to write out the cache and park the heads anyway. If so, the drive is still

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Gavin Sherry
On Thu, 23 Jun 2005, Tom Lane wrote: Gavin Sherry [EMAIL PROTECTED] writes: Curt Sampson [EMAIL PROTECTED] writes: But is it really a problem? I somewhere got the impression that some drives, on power failure, will be able to keep going for long enough to write out the cache and park the

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-22 Thread Curt Sampson
On Thu, 23 Jun 2005, Tom Lane wrote: The bottom line here seems to be the same as always: you can't run an industrial strength database on piece-of-junk consumer grade hardware. Sure you can, though it may take several bits of piece-of-junk consumer-grade hardware. It's far more about how you

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-21 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes: I tested two combinations,
- fsync_direct: O_DIRECT+fsync()
- open_direct: O_DIRECT+O_SYNC
to compare them with O_DIRECT on my linux machine. The pgbench results still show a performance win: scale| DBsize | open_sync | fsync=false | O_DIRECT
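The two combinations named above can be sketched as open(2) flag sets. This is a minimal Python illustration, not the patch's code: the function name `wal_open_flags` and the returned `(flags, needs_fsync)` pair are hypothetical, and `open_sync` is included only as the plain O_SYNC baseline. O_DIRECT is platform-specific, so the sketch falls back to 0 where the constant is missing.

```python
import os
from typing import Tuple

# O_DIRECT is not available on every platform; fall back to 0 there.
O_DIRECT = getattr(os, "O_DIRECT", 0)
O_SYNC = getattr(os, "O_SYNC", 0)


def wal_open_flags(method: str) -> Tuple[int, bool]:
    """Return (open flags, whether an explicit fsync() is still needed)."""
    base = os.O_WRONLY
    table = {
        # Baseline: synchronous writes through the OS page cache.
        "open_sync": (base | O_SYNC, False),
        # O_DIRECT + fsync(): bypass the page cache, flush explicitly.
        "fsync_direct": (base | O_DIRECT, True),
        # O_DIRECT + O_SYNC: bypass the page cache, each write is synchronous.
        "open_direct": (base | O_DIRECT | O_SYNC, False),
    }
    return table[method]


if __name__ == "__main__":
    for name in ("open_sync", "fsync_direct", "open_direct"):
        flags, needs_fsync = wal_open_flags(name)
        print(name, hex(flags), needs_fsync)
```

Note that on Linux, writes to a file opened with O_DIRECT generally need suitably aligned buffers, offsets, and lengths, so real code has more to do than choose flags.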

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-21 Thread Josh Berkus
Takahiro,

scale | DBsize | open_sync | fsync=false   | O_DIRECT only | fsync_direct  | open_direct
------+--------+-----------+---------------+---------------+---------------+--------------
   10 | 150MB  | 252.6 tps | 263.5(+ 4.3%) | 253.4(+ 0.3%) | 253.6(+ 0.4%) | 253.3(+ 0.3%)
  100 | 1.5GB  | 102.7 tps

Re: [HACKERS] [PATCHES] O_DIRECT for WAL writes

2005-06-20 Thread ITAGAKI Takahiro
Hi all, O_DIRECT for WAL writes was discussed at http://archives.postgresql.org/pgsql-patches/2005-06/msg00064.php but I have some items that I would like to discuss further, so I would like to re-post it to HACKERS. Bruce Momjian pgman@candle.pha.pa.us wrote: I think the conclusion from the discussion