Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2010-01-19 Thread Andres Freund
Hi Greg,

On Tuesday 19 January 2010 15:52:25 Greg Stark wrote:
> On Mon, Jan 18, 2010 at 4:35 PM, Greg Stark  wrote:
> > Looking at this patch for the commitfest I have a few questions.
> 
> So I've touched this patch up a bit:
> 
> 1) moved the posix_fadvise call to a new fd.c function
> pg_fsync_start(fd,offset,nbytes) which initiates an fsync without
> waiting on it. Currently it's only implemented with
> posix_fadvise(DONT_NEED) but I want to look into using sync_file_range
> in the future -- it looks like this call might be good enough for our
> checkpoints.
Why exactly should that depend on fsync? Sure, that's where most of the pain
comes from now, but avoiding the cache poisoning wouldn't hurt in other cases
either.

I would rather have it called pg_flush_cache_range or such...
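
A minimal sketch of what such a helper might look like, assuming a POSIX
system that provides posix_fadvise(). The names flush_cache_range and
write_advised are hypothetical and are not taken from the patch; the 64k
chunk size mirrors point 2 of the quoted mail. On Linux, POSIX_FADV_DONTNEED
on dirty pages is expected to start writeback without blocking, which is the
behaviour this thread relies on; POSIX itself does not promise that.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the function from the patch): hint that the given
 * range of fd will not be needed again, without waiting for any I/O.
 * nbytes == 0 means "to the end of the file".  Returns 0 on success or an
 * error number; posix_fadvise() reports errors via its return value.
 */
static int
flush_cache_range(int fd, off_t offset, off_t nbytes)
{
#ifdef POSIX_FADV_DONTNEED
    return posix_fadvise(fd, offset, nbytes, POSIX_FADV_DONTNEED);
#else
    (void) fd; (void) offset; (void) nbytes;
    return 0;                   /* no hint available: silently do nothing */
#endif
}

/* Write a buffer to a new file in 64k chunks, advising each chunk as we go. */
static int
write_advised(const char *path, const char *data, size_t len)
{
    const size_t chunk = 64 * 1024;
    size_t  done = 0;
    int     fd;

    assert(path != NULL && data != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    while (done < len)
    {
        size_t  want = (len - done < chunk) ? len - done : chunk;
        ssize_t n = write(fd, data + done, want);

        if (n <= 0)
        {
            close(fd);
            return -1;
        }
        /* advisory only, so a failure here is deliberately ignored */
        (void) flush_cache_range(fd, (off_t) done, (off_t) n);
        done += (size_t) n;
    }
    return close(fd);
}
```

Note that this only hints; durability still needs a later fsync() on the
file.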

> 2) advised each 64k chunk as we write it which should avoid poisoning
> the cache if you do a large create database on an active system.
> 
> 3) added the promised but afaict missing fsync of the directory -- i
> think we should actually backpatch this.
I think so as well. You need it while recursing too, though (which is where I
had added it), and not only for the final directory.
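
For illustration, a directory fsync can be sketched as below. This is a
hypothetical standalone helper (fsync_dir is not a name from the patch), and
it assumes a platform such as Linux where fsync() on a directory file
descriptor is supported. The directory is opened read-only because
directories cannot be opened for writing.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a directory so that newly created entries in it
 * reach storage.  Returns 0 on success, -1 with errno set on failure.
 */
static int
fsync_dir(const char *path)
{
    int     fd;
    int     rc;
    int     saved_errno;

    assert(path != NULL);
    fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd < 0)
        return -1;
    rc = fsync(fd);
    saved_errno = errno;        /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}
```

When copying recursively, such a call would be needed for every directory
that gets created, not only for the top-level one.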

> Barring any objections shall I commit it like this?
Other than the two things above it looks fine to me.

Thanks,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-29 Thread Andres Freund
On Monday 28 December 2009 23:59:43 Andres Freund wrote:
> On Monday 28 December 2009 23:54:51 Andres Freund wrote:
> > On Saturday 12 December 2009 21:38:41 Andres Freund wrote:
> > > On Saturday 12 December 2009 21:36:27 Michael Clemmons wrote:
> > > > If ppl think its worth it I'll create a ticket
> > >
> > > Thanks, no need. I will post a patch tomorrow or so.
> >
> > Well. It was a long day...
> >
> > Anyway.
> > In this patch I delay the fsync done in copy_file and simply do a second
> >  pass over the directory in copydir and fsync everything in that pass.
> > Including the directory - which was not done before and actually might be
> > necessary in some cases.
> > I added a posix_fadvise(..., FADV_DONTNEED) to make it more likely that
> > the copied file reaches storage before the fsync. Without it the speed
> > benefits were quite a bit smaller and essentially random (which seems
> > sensible).
> >
> > This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s
> > on my laptop.  Still slower than with fsync off (~0.25) but quite a
> > worthy improvement.
> >
> > The benefits are obviously bigger if the template database includes
> >  anything added.
> 
> Obviously the patch would be helpful.
And it would also be helpful not to have annoying oversights in there: a
FreeDir(xldir); is missing at the end of copydir().

Andres



Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-29 Thread Andres Freund
On Tuesday 29 December 2009 11:48:10 Greg Stark wrote:
> On Tue, Dec 29, 2009 at 2:05 AM, Andres Freund  wrote:
> >  Reads Completed:        2,        8KiB  Writes Completed:     2362,    29672KiB
> > New:
> >  Reads Completed:        0,        0KiB  Writes Completed:      550,     5960KiB
> 
> It looks like the new method is only doing 1/6th as much i/o. Do you
> know what's going on there?
While I was surprised by the size of the difference, I am not surprised at all
that there is a significant one: currently the fsync writes out a whole
bunch of useless stuff every time it's called (all metadata, directory structure
and so on).

This is reproducible...

6MB sounds sensible for the operation btw - the template database is around 
5MB.


Will try to analyze later what exactly causes the additional io.


Andres



Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-29 Thread Greg Stark
On Tue, Dec 29, 2009 at 2:05 AM, Andres Freund  wrote:
>  Reads Completed:        2,        8KiB  Writes Completed:     2362,    29672KiB
> New:
>  Reads Completed:        0,        0KiB  Writes Completed:      550,     5960KiB

It looks like the new method is only doing 1/6th as much i/o. Do you
know what's going on there?


-- 
greg



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Tuesday 29 December 2009 04:04:06 Michael Clemmons wrote:
> Maybe not crash out, but consider this situation:
> N=0
> while(N>=0):
>     CREATE DATABASE new_db_N;
> Since the fsync is the part which takes the memory and time but is
>  happening in the background, won't the fsyncs pile up in the background
>  faster than they can be run, filling up the memory and stack?
> This is very likely a mistake on my part about how postgres/processes
The difference should not be visible outside the "CREATE DATABASE ..." at all.
Simplified, the process currently works like this:


for file in source directory:
    copy_file(source/file, target/file);
    fsync(target/file);


I changed it to:

for file in source directory:
    copy_file(source/file, target/file);

    /* please, dear kernel, write this out, but don't block */
    posix_fadvise(target/file, FADV_DONTNEED);

for file in source directory:
    fsync(target/file);

If at any point there is not enough cache available, copy_file() will just
have to wait for the kernel to write out the data.
fsync() does not use memory itself.
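
The two-pass scheme described above can be sketched in plain C roughly as
follows. This is an illustration outside the PostgreSQL tree, not the patch
itself: all function names are hypothetical, error handling is reduced to
return codes, subdirectories are skipped rather than recursed into, and
posix_fadvise() is treated as a purely advisory hint.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Copy src to dst without fsync; only hint the kernel to write it back. */
static int
copy_file_nosync(const char *src, const char *dst)
{
    char    buf[65536];
    ssize_t n;
    int     rc = 0;
    int     in = open(src, O_RDONLY);
    int     out;

    if (in < 0)
        return -1;
    out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0)
    {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof(buf))) > 0)
    {
        if (write(out, buf, (size_t) n) != n)   /* treat short writes as errors */
        {
            rc = -1;
            break;
        }
    }
    if (n < 0)
        rc = -1;
#ifdef POSIX_FADV_DONTNEED
    /* advisory only, so the return value is deliberately ignored */
    (void) posix_fadvise(out, 0, 0, POSIX_FADV_DONTNEED);
#endif
    close(in);
    if (close(out) != 0)
        rc = -1;
    return rc;
}

/* Open path with the given flags and fsync it. */
static int
fsync_path(const char *path, int flags)
{
    int     fd = open(path, flags);
    int     rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/* One pass over the regular files in srcdir: copy if do_copy, else fsync. */
static int
one_pass(const char *srcdir, const char *dstdir, int do_copy)
{
    DIR    *dir = opendir(srcdir);
    struct dirent *de;
    int     rc = 0;

    if (dir == NULL)
        return -1;
    while (rc == 0 && (de = readdir(dir)) != NULL)
    {
        char    src[PATH_MAX];
        char    dst[PATH_MAX];
        struct stat st;

        snprintf(src, sizeof(src), "%s/%s", srcdir, de->d_name);
        snprintf(dst, sizeof(dst), "%s/%s", dstdir, de->d_name);
        if (stat(src, &st) != 0 || !S_ISREG(st.st_mode))
            continue;           /* skips ".", ".." and subdirectories */
        rc = do_copy ? copy_file_nosync(src, dst) : fsync_path(dst, O_WRONLY);
    }
    closedir(dir);
    return rc;
}

/* Pass 1 copies everything, pass 2 fsyncs everything, then the directory. */
static int
copy_dir_two_pass(const char *srcdir, const char *dstdir)
{
    assert(srcdir != NULL && dstdir != NULL);
    if (one_pass(srcdir, dstdir, 1) != 0 || one_pass(srcdir, dstdir, 0) != 0)
        return -1;
    return fsync_path(dstdir, O_RDONLY);
}
```

Nothing is queued inside the process: the only thing that accumulates between
the passes is dirty page cache, which the kernel is free to write back at any
time.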

Andres



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Michael Clemmons
Maybe not crash out, but consider this situation:
N=0
while(N>=0):
    CREATE DATABASE new_db_N;
Since the fsync is the part which takes the memory and time but is happening
in the background, won't the fsyncs pile up in the background faster than they
can be run, filling up the memory and stack?
This is very likely a misunderstanding on my part about how postgres/processes
actually work.
-Michael



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Tuesday 29 December 2009 03:53:12 Michael Clemmons wrote:
> Andres,
> Great job.  Looking through the emails and thinking about why this works, I
> think this patch should significantly speed up 8.4 on almost any file
> system (obviously some more than others) unless the system has significantly
> reduced memory or a slow single core. On a Celeron with 256 memory I
>  suspect it'll crash out or just hit the swap and be a worse bottleneck.
>  Anyone have something like this to test on?
Why should it crash? The kernel should just block on writing and write out the 
dirty memory before continuing?
Pg is not caching anything here...

Andres



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Michael Clemmons
Andres,
Great job.  Looking through the emails and thinking about why this works, I
think this patch should significantly speed up 8.4 on almost any file
system (obviously some more than others) unless the system has significantly
reduced memory or a slow single core. On a Celeron with 256 memory I suspect
it'll crash out or just hit the swap and be a worse bottleneck.  Anyone
have something like this to test on?
-Michael



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Tuesday 29 December 2009 01:46:21 Greg Smith wrote:
> Andres Freund wrote:
> > As I said the real benefit only occurred after adding posix_fadvise(..,
> > FADV_DONTNEED) which is somewhat plausible, because e.g. the directory
> > entries don't need to get scheduled for every file and because the kernel
> > can reorder a whole directory nearly sequentially. Without the advice
> > the kernel doesn't know in time that it should write that data back and
> > it won't do it for 5 seconds by default on Linux or such...
> It would be interesting to graph the "Dirty" and "Writeback" figures in
> /proc/meminfo over time with and without this patch in place.  That
> should make it obvious what the kernel is doing differently in the two
> cases.
I did some analysis using blktrace (useful tool btw) and the results show that
the I/O pattern is *significantly* different.

For one, with the direct fsyncing nearly no hardware queuing is used, and for
another, nearly no requests are merged on the software side.

Short stats:

OLD:

Total (8,0):
 Reads Queued:           2,        8KiB  Writes Queued:        7854,    29672KiB
 Read Dispatches:        2,        8KiB  Write Dispatches:     1926,    29672KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        2,        8KiB  Writes Completed:     2362,    29672KiB
 Read Merges:            0,        0KiB  Write Merges:         5492,    21968KiB
 PC Reads Queued:        0,        0KiB  PC Writes Queued:        0,        0KiB
 PC Read Disp.:        436,        0KiB  PC Write Disp.:          0,        0KiB
 PC Reads Req.:          0               PC Writes Req.:          0
 PC Reads Compl.:        0               PC Writes Compl.:     2362
 IO unplugs:          2395               Timer unplugs:         557


New:

Total (8,0):
 Reads Queued:           0,        0KiB  Writes Queued:        1716,     5960KiB
 Read Dispatches:        0,        0KiB  Write Dispatches:      324,     5960KiB
 Reads Requeued:         0               Writes Requeued:         0
 Reads Completed:        0,        0KiB  Writes Completed:      550,     5960KiB
 Read Merges:            0,        0KiB  Write Merges:         1166,     4664KiB
 PC Reads Queued:        0,        0KiB  PC Writes Queued:        0,        0KiB
 PC Read Disp.:        226,        0KiB  PC Write Disp.:          0,        0KiB
 PC Reads Req.:          0               PC Writes Req.:          0
 PC Reads Compl.:        0               PC Writes Compl.:      550
 IO unplugs:           503               Timer unplugs:          30


Andres



Re: [PERFORM] [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Greg Smith

Andres Freund wrote:
> As I said the real benefit only occurred after adding posix_fadvise(..,
> FADV_DONTNEED) which is somewhat plausible, because e.g. the directory entries
> don't need to get scheduled for every file and because the kernel can reorder a
> whole directory nearly sequentially. Without the advice the kernel doesn't
> know in time that it should write that data back and it won't do it for 5
> seconds by default on Linux or such...

I know they just fiddled with the logic in the last release, but for 
most of the Linux kernels out there now pdflush wakes up every 5 seconds 
by default.  But typically it only worries about writing things that 
have been in the queue for 30 seconds or more until you've filled quite 
a bit of memory, so that's also an interesting number.  I tried to 
document the main tunables here and describe how they fit together at 
http://www.westnet.com/~gsmith/content/linux-pdflush.htm


It would be interesting to graph the "Dirty" and "Writeback" figures in 
/proc/meminfo over time with and without this patch in place.  That 
should make it obvious what the kernel is doing differently in the two 
cases.
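
As a rough illustration of how such samples could be collected, here is a
small sketch that extracts the "Dirty" and "Writeback" values from the text
of /proc/meminfo. The function names are hypothetical, the file exists on
Linux only, and a real measurement would call the sampling function in a
loop with timestamps while CREATE DATABASE runs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the numeric value (in kB) of a "Key:    123 kB" line in the given
 * meminfo text, or -1 if the key is not present.  The ':' check keeps
 * "Writeback" from matching "WritebackTmp".
 */
static long
meminfo_kb(const char *text, const char *key)
{
    size_t      keylen;
    const char *line = text;

    assert(text != NULL && key != NULL);
    keylen = strlen(key);
    while (line != NULL && *line != '\0')
    {
        long    value;

        if (strncmp(line, key, keylen) == 0 && line[keylen] == ':' &&
            sscanf(line + keylen + 1, "%ld", &value) == 1)
            return value;
        line = strchr(line, '\n');
        if (line != NULL)
            line++;             /* step past the newline to the next line */
    }
    return -1;
}

/* Read /proc/meminfo once and print a single Dirty/Writeback sample. */
static int
print_meminfo_sample(void)
{
    char    buf[8192];
    size_t  n;
    FILE   *f = fopen("/proc/meminfo", "r");

    if (f == NULL)
        return -1;
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    buf[n] = '\0';
    printf("Dirty=%ld kB Writeback=%ld kB\n",
           meminfo_kb(buf, "Dirty"), meminfo_kb(buf, "Writeback"));
    return 0;
}
```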


--
Greg Smith    2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com




Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Tuesday 29 December 2009 00:06:28 Tom Lane wrote:
> Andres Freund  writes:
> > This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s
> > on my laptop.  Still slower than with fsync off (~0.25) but quite a
> > worthy improvement.
> 
> I can't help wondering whether that's real or some kind of
> platform-specific artifact.  I get numbers more like 3.5s (fsync off)
> vs 4.5s (fsync on) on a machine where I believe the disks aren't lying
> about write-complete.  It makes sense that an fsync at the end would be
> a little bit faster, because it would give the kernel some additional
> freedom in scheduling the required I/O, but it isn't cutting the total
> I/O required at all.  So I find it really hard to believe a 10x speedup.
I only comfortably have access to two smaller machines without BBU from here
(being in the Hacker Jeopardy at the CCC congress ;-)) and both show this
behaviour. I guess it's somewhat filesystem dependent.

Andres



Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Tuesday 29 December 2009 00:06:28 Tom Lane wrote:
> Andres Freund  writes:
> > This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s
> > on my laptop.  Still slower than with fsync off (~0.25) but quite a
> > worthy improvement.
> I can't help wondering whether that's real or some kind of
> platform-specific artifact.  I get numbers more like 3.5s (fsync off)
> vs 4.5s (fsync on) on a machine where I believe the disks aren't lying
> about write-complete.  It makes sense that an fsync at the end would be
> a little bit faster, because it would give the kernel some additional
> freedom in scheduling the required I/O, but it isn't cutting the total
> I/O required at all.  So I find it really hard to believe a 10x speedup.
Well, a template database is about 5.5MB big here - that shouldn't take too
long when written near-sequentially?
As I said, the real benefit only occurred after adding posix_fadvise(..,
FADV_DONTNEED), which is somewhat plausible, because e.g. the directory entries
don't need to get scheduled for every file and because the kernel can reorder a
whole directory nearly sequentially. Without the advice the kernel doesn't
know in time that it should write that data back, and it won't do it for 5
seconds by default on Linux or such...

I looked at the strace output - it looks sensible timewise to me. If you're
interested I can give you the output of that.

Andres



Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Tom Lane
Andres Freund  writes:
> This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s on my
> laptop.  Still slower than with fsync off (~0.25) but quite a worthy 
> improvement.

I can't help wondering whether that's real or some kind of
platform-specific artifact.  I get numbers more like 3.5s (fsync off)
vs 4.5s (fsync on) on a machine where I believe the disks aren't lying
about write-complete.  It makes sense that an fsync at the end would be
a little bit faster, because it would give the kernel some additional
freedom in scheduling the required I/O, but it isn't cutting the total
I/O required at all.  So I find it really hard to believe a 10x speedup.

regards, tom lane



Re: [HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Monday 28 December 2009 23:54:51 Andres Freund wrote:
> On Saturday 12 December 2009 21:38:41 Andres Freund wrote:
> > On Saturday 12 December 2009 21:36:27 Michael Clemmons wrote:
> > > If ppl think its worth it I'll create a ticket
> >
> > Thanks, no need. I will post a patch tomorrow or so.
> 
> Well. It was a long day...
> 
> Anyway.
> In this patch I delay the fsync done in copy_file and simply do a second
>  pass over the directory in copydir and fsync everything in that pass.
> Including the directory - which was not done before and actually might be
> necessary in some cases.
> I added a posix_fadvise(..., FADV_DONTNEED) to make it more likely that the
> copied file reaches storage before the fsync. Without it the speed benefits
>  were quite a bit smaller and essentially random (which seems sensible).
> 
> This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s on
>  my laptop.  Still slower than with fsync off (~0.25) but quite a worthy
>  improvement.
> 
> The benefits are obviously bigger if the template database includes
>  anything added.
Obviously the patch would be helpful.

Andres
From bd80748883d1328a71607a447677b0bfb1f54ab0 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Mon, 28 Dec 2009 23:43:57 +0100
Subject: [PATCH] Delay fsyncing files during copying in CREATE DATABASE - this
 dramatically speeds up CREATE DATABASE on non battery backed
 rotational storage.
 Additionally fsync() the directory to ensure all metadata reaches
 storage.

---
 src/port/copydir.c |   58 +--
 1 files changed, 51 insertions(+), 7 deletions(-)

diff --git a/src/port/copydir.c b/src/port/copydir.c
index a70477e..cde3dc7 100644
*** a/src/port/copydir.c
--- b/src/port/copydir.c
***************
*** 37,42 ****
--- 37,43 ----
  
  
  static void copy_file(char *fromfile, char *tofile);
+ static void fsync_fname(char *fname);
  
  
  /*
*************** copydir(char *fromdir, char *todir, bool
*** 64,69 ****
--- 65,73 ----
  				(errcode_for_file_access(),
  				 errmsg("could not open directory \"%s\": %m", fromdir)));
  
+ 	/*
+ 	 * Copy all the files
+ 	 */
  	while ((xlde = ReadDir(xldir, fromdir)) != NULL)
  	{
  		struct stat fst;
*************** copydir(char *fromdir, char *todir, bool
*** 89,96 ****
  		else if (S_ISREG(fst.st_mode))
  			copy_file(fromfile, tofile);
  	}
- 
  	FreeDir(xldir);
  }
  
  /*
--- 93,120 ----
  		else if (S_ISREG(fst.st_mode))
  			copy_file(fromfile, tofile);
  	}
  	FreeDir(xldir);
+ 
+ 	/*
+ 	 * Be paranoid here and fsync all files to ensure we catch problems.
+ 	 */
+ 	xldir = AllocateDir(fromdir);
+ 	if (xldir == NULL)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not open directory \"%s\": %m", fromdir)));
+ 
+ 	while ((xlde = ReadDir(xldir, fromdir)) != NULL)
+ 	{
+ 		struct stat fst;
+ 
+ 		if (strcmp(xlde->d_name, ".") == 0 ||
+ 			strcmp(xlde->d_name, "..") == 0)
+ 			continue;
+ 
+ 		snprintf(tofile, MAXPGPATH, "%s/%s", todir, xlde->d_name);
+ 		fsync_fname(tofile);
+ 	}
  }
  
  /*
*************** copy_file(char *fromfile, char *tofile)
*** 150,162 ****
  	}
  
  	/*
! 	 * Be paranoid here to ensure we catch problems.
  	 */
! 	if (pg_fsync(dstfd) != 0)
! 		ereport(ERROR,
! 				(errcode_for_file_access(),
! 				 errmsg("could not fsync file \"%s\": %m", tofile)));
! 
  	if (close(dstfd))
  		ereport(ERROR,
  				(errcode_for_file_access(),
--- 174,185 ----
  	}
  
  	/*
! 	 * We tell the kernel here to write the data back in order to make
! 	 * the later fsync cheaper.
  	 */
! #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
! 	posix_fadvise(dstfd, 0, 0, POSIX_FADV_DONTNEED);
! #endif
  	if (close(dstfd))
  		ereport(ERROR,
  				(errcode_for_file_access(),
*************** copy_file(char *fromfile, char *tofile)
*** 166,168 ****
--- 189,212 ----
  
  	pfree(buffer);
  }
+ 
+ /*
+  * fsync a file
+  */
+ static void
+ fsync_fname(char *fname)
+ {
+ 	int	fd = BasicOpenFile(fname, O_RDWR| PG_BINARY,
+ 		  S_IRUSR | S_IWUSR);
+ 
+ 	if (fd < 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not create file \"%s\": %m", fname)));
+ 
+ 	if (pg_fsync(fd) != 0)
+ 		ereport(ERROR,
+ 				(errcode_for_file_access(),
+ 				 errmsg("could not fsync file \"%s\": %m", fname)));
+ 	close(fd);
+ }
-- 
1.6.5.12.gd65df24




[HACKERS] Faster CREATE DATABASE by delaying fsync (was 8.4.1 ubuntu karmic slow createdb)

2009-12-28 Thread Andres Freund
On Saturday 12 December 2009 21:38:41 Andres Freund wrote:
> On Saturday 12 December 2009 21:36:27 Michael Clemmons wrote:
> > If ppl think its worth it I'll create a ticket
> Thanks, no need. I will post a patch tomorrow or so.
Well. It was a long day...

Anyway.
In this patch I delay the fsync done in copy_file and simply do a second pass
over the directory in copydir and fsync everything in that pass.
Including the directory - which was not done before and actually might be
necessary in some cases.
I added a posix_fadvise(..., FADV_DONTNEED) to make it more likely that the
copied file reaches storage before the fsync. Without it the speed benefits were
quite a bit smaller and essentially random (which seems sensible).

This speeds up CREATE DATABASE from ~9 seconds to something around 0.8s on my 
laptop.  Still slower than with fsync off (~0.25) but quite a worthy 
improvement.

The benefits are obviously bigger if the template database includes anything 
added.


Andres
