Hi all,

O_DIRECT for WAL writes was discussed at
http://archives.postgresql.org/pgsql-patches/2005-06/msg00064.php
but some items still need discussion, so I am re-posting it to HACKERS.

Bruce Momjian <pgman@candle.pha.pa.us> wrote:

> I think the conclusion from the discussion is that O_DIRECT is in
> addition to the sync method, rather than in place of it, because
> O_DIRECT doesn't have the same media write guarantees as fsync(). Would
> you update the patch to do and see if there is a performance win?

I tested two combinations,
  - fsync_direct: O_DIRECT+fsync()
  - open_direct:  O_DIRECT+O_SYNC
to compare them with O_DIRECT alone on my Linux machine.
The pgbench results still show a performance win:

scale| DBsize | open_sync | fsync=false  | O_DIRECT only| fsync_direct | open_direct
-----+--------+-----------+--------------+--------------+--------------+---------------
  10 | 150MB  | 252.6 tps | 263.5(+ 4.3%)| 253.4(+ 0.3%)| 253.6(+ 0.4%)| 253.3(+ 0.3%)
 100 | 1.5GB  | 102.7 tps | 117.8(+14.7%)| 147.6(+43.7%)| 148.9(+45.0%)| 150.8(+46.8%)
60 runs * pgbench -c 10 -t 1000
on one Pentium4, 1GB mem, 2 ATA disks, Linux 2.6.8
(percentages are relative to open_sync)

O_DIRECT, fsync_direct and open_direct show the same performance trend.
There was a win at scale=100, but no win at scale=10, which is a fully
in-memory benchmark.

The following items still need to be discussed:
  - Are their names appropriate? Simplify to 'direct'?
  - Are both fsync_direct and open_direct necessary?
    MySQL seems to use only the O_DIRECT+fsync() combination.
  - Is it OK to set the direct I/O buffer alignment to BLCKSZ?
    This is a simple way to choose an alignment that works in many
    environments. If that is not enough, BLCKSZ itself would also be
    a problem for direct I/O.

BTW, IMHO the major benefit of direct I/O is saving memory. O_DIRECT gives
a hint that the OS should not cache WAL files. Without direct I/O, the OS
might make an effort to cache WAL files, which will never be read again,
and might discard cached data file pages to do so.
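
For reference, a minimal sketch of the fsync_direct combination with a
BLCKSZ-aligned buffer might look like the following. This is not the code
from the patch: the function names are hypothetical, BLCKSZ is assumed to
be the default 8192, and the buffer is padded with zeros so that the write
length is a multiple of BLCKSZ. It targets Linux, where O_DIRECT needs
_GNU_SOURCE.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumption: PostgreSQL's default block size */

/* Round len up to the next multiple of BLCKSZ. */
size_t align_up(size_t len)
{
    return (len + BLCKSZ - 1) / BLCKSZ * BLCKSZ;
}

/* Allocate a zero-filled buffer whose address is BLCKSZ-aligned.
 * Returns NULL on failure; release with free(). */
void *alloc_dio_buffer(size_t len)
{
    void   *buf = NULL;
    size_t  padded = align_up(len);

    if (posix_memalign(&buf, BLCKSZ, padded) != 0)
        return NULL;
    memset(buf, 0, padded);
    return buf;
}

/* Hypothetical fsync_direct-style write: O_DIRECT asks the OS not to
 * cache the data, and fsync() is still issued afterwards.
 * Returns 0 on success, -1 on error. */
int write_fsync_direct(const char *path, const void *data, size_t len)
{
    size_t  padded = align_up(len);
    void   *buf = alloc_dio_buffer(len);
    int     fd;
    int     rc = -1;

    if (buf == NULL)
        return -1;
    assert(len <= padded);
    memcpy(buf, data, len);     /* the rest of the buffer stays zero */

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0)
    {
        free(buf);
        return -1;              /* e.g. EINVAL if the filesystem lacks O_DIRECT */
    }

    /* Address, length and file offset (0 here) are all BLCKSZ multiples. */
    if (write(fd, buf, padded) == (ssize_t) padded && fsync(fd) == 0)
        rc = 0;

    close(fd);
    free(buf);
    return rc;
}
```

An open_direct-style variant would add O_SYNC to the open() flags and drop
the fsync() call. Whether BLCKSZ alignment is sufficient depends on the
filesystem and device, which is the open question above.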

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories