Re: [PATCHES] WAL: O_DIRECT and multipage-writer

2005-01-25 Thread Tom Lane
ITAGAKI Takahiro <[EMAIL PROTECTED]> writes:
> I resend the patch with diff -c.

What does XLOG_EXTRA_BUFFERS accomplish?

Also, I'm worried that you broke something by not updating
Write->curridx immediately in XLogWrite.  There certainly isn't going
to be any measurable performance boost from keeping that in a local
variable, so why take any risk?

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] WAL: O_DIRECT and multipage-writer

2005-01-25 Thread ITAGAKI Takahiro
Excuse me.
I resend the patch with diff -c.


On Tue, 25 Jan 2005 10:30:01 +0100
"Michael Paesold" <[EMAIL PROTECTED]> wrote:

> ITAGAKI Takahiro wrote:
> 
> > I think that there is room for improvement in WAL.
> > Here is a patch for it.
> 
> I think you should resend your patch as a context diff (diff -c). Otherwise 
> it's hard to see what your patch does.

---
ITAGAKI Takahiro <[EMAIL PROTECTED]>
NTT Cyber Space Laboratories
Nippon Telegraph and Telephone Corporation.


xlog.c.diff
Description: Binary data



Re: [PATCHES] WAL: O_DIRECT and multipage-writer

2005-01-25 Thread Michael Paesold
ITAGAKI Takahiro wrote:

> I think that there is room for improvement in WAL.
> Here is a patch for it.

I think you should resend your patch as a context diff (diff -c). Otherwise 
it's hard to see what your patch does.

Best Regards,
Michael Paesold 



[PATCHES] WAL: O_DIRECT and multipage-writer

2005-01-25 Thread ITAGAKI Takahiro
Hello, all.

I think that there is room for improvement in WAL. 
Here is a patch for it.
  - Multiple pages are written in one write() if they are contiguous.
  - Add 'open_direct' to wal_sync_method.

The WAL writer currently writes one page per write(). This is inefficient
when wal_sync_method is 'open_sync', because the writer waits for I/O
completion at each write(). A multipage writer can reduce the number of
syscalls and improve I/O throughput.
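
The coalescing idea above can be sketched roughly as follows. This is only
an illustration, not the code in the attached patch: the function name
write_pages_coalesced(), the PAGE_SZ constant and the flat "ring" array are
all hypothetical stand-ins for the real WAL buffer structures. Pages that
are adjacent in the ring are also adjacent in memory, so each contiguous run
goes out in a single write(); a run ends only where the ring wraps.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SZ 8192    /* hypothetical page size, standing in for the WAL block size */

/*
 * Write npages pages, starting at ring slot "start", from a ring of nbufs
 * pages stored back to back in memory.  Each contiguous run is handed to
 * write() as one request instead of one request per page.
 * Returns the number of contiguous runs written, or -1 on error.
 */
int
write_pages_coalesced(int fd, const char *ring, int nbufs, int start, int npages)
{
    int nruns = 0;

    assert(nbufs > 0 && start >= 0 && start < nbufs);
    assert(npages >= 0 && npages <= nbufs);

    while (npages > 0)
    {
        int         run = nbufs - start;    /* pages left before the ring wraps */
        const char *p;
        size_t      left;

        if (run > npages)
            run = npages;
        p = ring + (size_t) start * PAGE_SZ;
        left = (size_t) run * PAGE_SZ;

        /* Loop only to cope with short writes and EINTR. */
        while (left > 0)
        {
            ssize_t n = write(fd, p, left);

            if (n < 0)
            {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            p += n;
            left -= (size_t) n;
        }

        nruns++;
        npages -= run;
        start = (start + run) % nbufs;
    }
    return nruns;
}
```

With a 4-page ring, writing 3 pages from slot 2 takes two runs (slots 2-3,
then slot 0), whereas a page-at-a-time writer would issue three write()
calls. With a file opened O_SYNC, each call avoided is one wait avoided.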

'open_direct' uses O_DIRECT instead of O_SYNC. O_DIRECT implies synchronous
writing, so it may behave much like open_sync. But it may also avoid a
memcpy() and save the OS's disk cache memory.
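
For readers unfamiliar with O_DIRECT, here is a minimal sketch of what
opening a file this way involves. It is not taken from the attached patch.
The helper names (direct_io_alignment, alloc_direct_buffer, open_wal_direct)
are hypothetical, using the VM page size as the alignment is an assumption
that is usually safe but not guaranteed for every device, and the fallback
to O_SYNC when the filesystem rejects O_DIRECT is an illustrative choice.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#ifdef O_DIRECT
#define WAL_DIRECT_FLAG O_DIRECT
#else
#define WAL_DIRECT_FLAG 0       /* no O_DIRECT on this platform: plain open */
#endif

/* Alignment picked at run time.  Assumption: the VM page size is enough. */
size_t
direct_io_alignment(void)
{
    long pagesz = sysconf(_SC_PAGESIZE);

    return (pagesz > 0) ? (size_t) pagesz : 4096;
}

/*
 * Allocate a buffer suitable for direct I/O: the address is aligned and the
 * size is rounded up to a multiple of the alignment.  Returns NULL on failure.
 */
void *
alloc_direct_buffer(size_t nbytes)
{
    size_t  align = direct_io_alignment();
    size_t  rounded = (nbytes + align - 1) / align * align;
    void   *buf = NULL;

    if (posix_memalign(&buf, align, rounded) != 0)
        return NULL;
    return buf;
}

/*
 * Open a file for direct writes.  If the filesystem rejects O_DIRECT with
 * EINVAL, fall back to O_SYNC.  *is_direct reports which mode was obtained.
 */
int
open_wal_direct(const char *path, int *is_direct)
{
    int fd = open(path, O_WRONLY | O_CREAT | WAL_DIRECT_FLAG, 0600);

    *is_direct = (fd >= 0 && WAL_DIRECT_FLAG != 0);
    if (fd < 0 && errno == EINVAL && WAL_DIRECT_FLAG != 0)
        fd = open(path, O_WRONLY | O_CREAT | O_SYNC, 0600);
    return fd;
}
```

Every write() on such a descriptor generally has to use a buffer address,
length and file offset that satisfy the alignment, which is why the WAL
buffers themselves would need to be allocated this way.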

I benchmarked this patch with pgbench. It works well and improved tps
by more than 50% on my machine (see the pgbench result below). WAL seems
to be the bottleneck on machines with poor disks.

This patch has not yet been tested enough. I would like it to be reviewed
thoroughly and accepted into PostgreSQL.

There are still many TODOs:
  - Is this logic really correct?
  - O_DIRECT_BUFFER_ALIGN should be determined at run time, not compile time.
  - Consider using writev() instead of write().
    Buffers are noncontiguous when the WAL ring buffer wraps around.
  - If wal_sync_method is not open_direct, XLOG_EXTRA_BUFFERS can be 0.
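
The writev() item in the list above could look roughly like this. Again
this is a sketch under assumptions, not part of the patch: the name
writev_ring_pages(), PAGE_SZ and the flat "ring" array are hypothetical.
When the requested range wraps past the end of the ring, the two pieces are
gathered into a single writev() instead of two write() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define PAGE_SZ 8192    /* hypothetical page size, standing in for the WAL block size */

/*
 * Write npages pages starting at ring slot "start" with one writev().
 * If the range wraps, the tail of the ring and the head of the ring are
 * passed as two iovec entries.  Returns the byte count reported by
 * writev(), or -1 on error; the caller must still handle short writes.
 */
ssize_t
writev_ring_pages(int fd, char *ring, int nbufs, int start, int npages)
{
    struct iovec iov[2];
    int          iovcnt = 0;
    int          first;

    assert(nbufs > 0 && start >= 0 && start < nbufs);
    assert(npages >= 0 && npages <= nbufs);

    if (npages == 0)
        return 0;

    first = nbufs - start;              /* pages before the ring wraps */
    if (first > npages)
        first = npages;

    iov[iovcnt].iov_base = ring + (size_t) start * PAGE_SZ;
    iov[iovcnt].iov_len = (size_t) first * PAGE_SZ;
    iovcnt++;

    if (npages > first)                 /* wrapped: add the head of the ring */
    {
        iov[iovcnt].iov_base = ring;
        iov[iovcnt].iov_len = (size_t) (npages - first) * PAGE_SZ;
        iovcnt++;
    }

    return writev(fd, iov, iovcnt);
}
```

One caveat if this is combined with open_direct: on Linux each iovec entry
would presumably have to meet the O_DIRECT alignment rules on its own, so
the ring base and page size need to be aligned accordingly.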


Sincerely,
ITAGAKI Takahiro



-- pgbench result --

$ ./pgbench -s 100 -c 50 -t 400

- 8.0.0 default + fsync:
tps = 20.630632 (including connections establishing)
tps = 20.636768 (excluding connections establishing)
- multipage-writer + open_direct:
tps = 33.761917 (including connections establishing)
tps = 33.778320 (excluding connections establishing)

Environment:
  OS : Linux kernel 2.6.9
  CPU: Pentium 4 3GHz
  disk   : ATA 5400rpm (Data and WAL are placed on same partition.)
  memory : 1GB
  config : shared_buffers=1, wal_buffers=256,
   XLOG_SEG_SIZE=256MB, checkpoint_segments=4

---
ITAGAKI Takahiro <[EMAIL PROTECTED]>
NTT Cyber Space Laboratories
Nippon Telegraph and Telephone Corporation.


xlog.diff
Description: Binary data
