Re: [HACKERS] Load distributed checkpoint V4

2007-04-23 Thread Heikki Linnakangas
ITAGAKI Takahiro wrote: Heikki Linnakangas <[EMAIL PROTECTED]> wrote: We might want to call GetCheckpointProgress something else, though. It doesn't return the amount of progress made, but rather the amount of progress we should've made up to that point or we're in danger of not completing the
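
The distinction Heikki draws (progress we should have made by now, not progress actually made) can be sketched as follows. This is only an illustration: the snippet does not show how the patch computes the value, so the linear time-based formula and the names `expected_progress` and `is_behind` are assumptions, not the patch's code.

```python
def expected_progress(elapsed_s, checkpoint_timeout_s):
    """Fraction of the checkpoint that *should* be finished by now (0.0 to 1.0).

    Assumption: the target grows linearly with elapsed time.
    """
    if checkpoint_timeout_s <= 0:
        raise ValueError("checkpoint_timeout_s must be positive")
    # Clamp so that a late or not-yet-started checkpoint stays within [0, 1].
    return min(max(elapsed_s / checkpoint_timeout_s, 0.0), 1.0)


def is_behind(done_fraction, elapsed_s, checkpoint_timeout_s):
    """True when the work actually done lags the schedule, i.e. stop napping."""
    return done_fraction < expected_progress(elapsed_s, checkpoint_timeout_s)
```

With a hypothetical 300-second timeout, a checkpoint that is 10% done after 75 seconds is behind its 25% target, which is why a name suggesting "target" rather than "progress" would be less misleading.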

Re: [HACKERS] Load distributed checkpoint V4

2007-04-22 Thread ITAGAKI Takahiro
Heikki Linnakangas <[EMAIL PROTECTED]> wrote: Thanks for making my patch clearly understandable! > We might want to call GetCheckpointProgress something > else, though. It doesn't return the amount of progress made, but rather > the amount of progress we should've made up to that point or we're

Re: [PATCHES] [HACKERS] Load distributed checkpoint

2007-02-27 Thread ITAGAKI Takahiro
"Inaam Rana" <[EMAIL PROTECTED]> wrote: > One of the issues we had during testing with original patch was db stop not > working properly. I think you coded something to do a stop checkpoint in > immediately but if a checkpoint is already in progress at that time, it > would take its own time to c

Re: [PATCHES] [HACKERS] Load distributed checkpoint

2007-02-26 Thread Inaam Rana
On 2/26/07, ITAGAKI Takahiro <[EMAIL PROTECTED]> wrote: Josh Berkus wrote: > Can I have a copy of the patch to add to the Sun testing queue? This is the revised version of the patch. Delay factors in checkpoints can be specified by checkpoint_write_percent, checkpoint_nap_percent and checkpoi

Re: [HACKERS] Load distributed checkpoint

2007-02-26 Thread ITAGAKI Takahiro
Josh Berkus wrote: > Can I have a copy of the patch to add to the Sun testing queue? This is the revised version of the patch. Delay factors in checkpoints can be specified by checkpoint_write_percent, checkpoint_nap_percent and checkpoint_sync_percent. They are relative to checkpoint_timeout.
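
To make "relative to checkpoint_timeout" concrete, here is a small sketch that turns the three percentages into seconds per phase. The three parameter names come from the message above; the 300-second timeout, the 50/10/20 split, and the rule that the percentages must not sum to more than 100 are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class CheckpointPhases:
    write_s: float  # time budget for the write() phase
    nap_s: float    # pause between the write and sync phases
    sync_s: float   # time budget for the fsync() phase


def phase_durations(checkpoint_timeout_s, write_percent, nap_percent, sync_percent):
    """Convert percent-of-timeout settings into seconds for each phase."""
    percents = (write_percent, nap_percent, sync_percent)
    # Assumed sanity rule, not taken from the patch.
    if min(percents) < 0 or sum(percents) > 100:
        raise ValueError("percentages must be >= 0 and sum to at most 100")
    scale = checkpoint_timeout_s / 100.0
    return CheckpointPhases(
        write_s=write_percent * scale,
        nap_s=nap_percent * scale,
        sync_s=sync_percent * scale,
    )
```

For example, `phase_durations(300, 50, 10, 20)` gives 150 seconds of writing, a 30-second nap and 60 seconds of syncing. Changing the timeout rescales all three together, which is the point of making the settings relative.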

Re: [HACKERS] Load distributed checkpoint

2007-02-26 Thread Josh Berkus
Itagaki, > Thank you for testing! Yes, I'm cleaning the patch. I changed > configuration parameters to delay each phase in checkpoints from setting > absolute times (checkpoint_xxx_duration) to setting relative to > checkpoint_timeout (checkpoint_xxx_percent). Delay factors strongly > depend on to

Re: [HACKERS] Load distributed checkpoint

2007-02-26 Thread ITAGAKI Takahiro
"Inaam Rana" <[EMAIL PROTECTED]> wrote: > Did you have a chance to look into this any further? We, at EnterpriseDB, > have done some testing on this patch (dbt2 runs) and it looks like we are > getting the desired results, particularly so when we spread out both sync > and write phases. Thank you

Re: [HACKERS] Load distributed checkpoint

2007-02-26 Thread Inaam Rana
On 12/19/06, ITAGAKI Takahiro <[EMAIL PROTECTED]> wrote: "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > I performed some simple tests, and I'll show the results below. > (1) The default case > 235 80 226 77 240 > (2) No write case > 242 250 244 253 280 > (3) No checkpoint case > 229

Re: [HACKERS] Load distributed checkpoint

2007-02-02 Thread Bruce Momjian
Thread added to TODO list: * Reduce checkpoint performance degradation by forcing data to disk more evenly http://archives.postgresql.org/pgsql-hackers/2006-12/msg00337.php http://archives.postgresql.org/pgsql-hackers/2007-01/msg00079.php

Re: [HACKERS] Load distributed checkpoint

2007-01-11 Thread Inaam Rana
No, I've not tried yet. Inaam-san told me that Linux had a few I/O schedulers but I'm not familiar with them. I'll find information about them (how to change the scheduler settings) and try the same test. I am sorry, your response just slipped by me. The docs for RHEL (I believe you are runn

Re: [HACKERS] Load distributed checkpoint

2007-01-09 Thread ITAGAKI Takahiro
I wrote: > I'm thinking about generalizing your idea; Adding three parameters > to control sleeps in each stage. I sent a patch to -patches that adds 3+1 GUC parameters for checkpoints. We can use three of them to control sleeps in each stage during checkpoints. The last is an experimental approac

Re: [HACKERS] Load distributed checkpoint

2007-01-08 Thread Takayuki Tsunakawa
Tsunakawa" <[EMAIL PROTECTED]> Cc: "ITAGAKI Takahiro" <[EMAIL PROTECTED]>; Sent: Thursday, December 28, 2006 7:07 AM Subject: Re: [HACKERS] Load distributed checkpoint > On Mon, 2006-12-18 at 14:47 +0900, Takayuki Tsunakawa wrote: >> Hello, Itagaki-san, all >

Re: [HACKERS] Load distributed checkpoint

2007-01-03 Thread Zeugswetter Andreas ADI SD
> > I believe there's something similar for OS X as well. The question is: > > would it be better to do that, or to just delay calling fsync until the > > OS has had a chance to write things out. > > A delay is not going to help unless you can suppress additional writes > to the file, which I don

Re: [HACKERS] Load distributed checkpoint

2006-12-29 Thread Jim C. Nasby
On Fri, Dec 29, 2006 at 09:02:11PM -0500, Bruce Momjian wrote: > Tom Lane wrote: > > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > > I believe there's something similar for OS X as well. The question is: > > > would it be better to do that, or to just delay calling fsync until the > > > OS has had

Re: [HACKERS] Load distributed checkpoint

2006-12-29 Thread Bruce Momjian
Tom Lane wrote: > "Jim C. Nasby" <[EMAIL PROTECTED]> writes: > > I believe there's something similar for OS X as well. The question is: > > would it be better to do that, or to just delay calling fsync until the > > OS has had a chance to write things out. > > A delay is not going to help unless y

Re: [HACKERS] Load distributed checkpoint

2006-12-29 Thread Tom Lane
"Jim C. Nasby" <[EMAIL PROTECTED]> writes: > I believe there's something similar for OS X as well. The question is: > would it be better to do that, or to just delay calling fsync until the > OS has had a chance to write things out. A delay is not going to help unless you can suppress additional w

Re: [HACKERS] Load distributed checkpoint

2006-12-29 Thread Jim C. Nasby
On Thu, Dec 28, 2006 at 09:28:48PM +, Heikki Linnakangas wrote: > Tom Lane wrote: > >To my mind the problem with fsync is not that it gives us too little > >control but that it gives too much: we have to specify a particular > >order of writing out files. What we'd really like is a version of

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Heikki Linnakangas
Tom Lane wrote: To my mind the problem with fsync is not that it gives us too little control but that it gives too much: we have to specify a particular order of writing out files. What we'd really like is a version of sync(2) that tells us when it's done but doesn't constrain the I/O scheduler'

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Bruce Momjian
Tom Lane wrote: > Bruce Momjian <[EMAIL PROTECTED]> writes: > > Jim C. Nasby wrote: > >> What about the mmap/msync(?)/munmap idea someone mentioned? > > > I see that as similar to using O_DIRECT during checkpoint, which had > > poor performance. > > That's a complete nonstarter on portability gro

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes: > Jim C. Nasby wrote: >> What about the mmap/msync(?)/munmap idea someone mentioned? > I see that as similar to using O_DIRECT during checkpoint, which had > poor performance. That's a complete nonstarter on portability grounds, even if msync gave us the

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Bruce Momjian
Jim C. Nasby wrote: > On Thu, Dec 28, 2006 at 12:50:19PM -0500, Bruce Momjian wrote: > > To summarize, if we could have fsync() only write the dirty buffers that > > happened as part of the checkpoint, we could delay the write() for the > > entire time between checkpoints, but we can't do that, so

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Jim C. Nasby
On Thu, Dec 28, 2006 at 12:50:19PM -0500, Bruce Momjian wrote: > To summarize, if we could have fsync() only write the dirty buffers that > happened as part of the checkpoint, we could delay the write() for the > entire time between checkpoints, but we can't do that, so we have to > make it user-tu

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Bruce Momjian
ITAGAKI Takahiro wrote: > > Bruce Momjian <[EMAIL PROTECTED]> wrote: > > > > 566.973777 > > > 327.158222 <- (1) write() > > > 560.773868 <- (2) sleep > > > 544.106645 <- (3) fsync() > > > > OK, so you are saying that performance dropped only during the write(), > > and not during the fsync()? I

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Jim C. Nasby
On Wed, Dec 27, 2006 at 10:54:57PM +, Simon Riggs wrote: > On Wed, 2006-12-27 at 23:26 +0100, Martijn van Oosterhout wrote: > > On Wed, Dec 27, 2006 at 09:24:06PM +, Simon Riggs wrote: > > > On Fri, 2006-12-22 at 13:53 -0500, Bruce Momjian wrote: > > > > > > > I assume other kernels have s

Re: [HACKERS] Load distributed checkpoint

2006-12-28 Thread Ron Mayer
Gregory Stark wrote: > "Bruce Momjian" <[EMAIL PROTECTED]> writes: > >> I have a new idea. ...the BSD kernel...similar issue...to smooth writes: > Linux has a more complex solution to this (of course) which has undergone a > few generations over time. Older kernels had a user space daemon called

Re: [HACKERS] Load distributed checkpoint

2006-12-27 Thread ITAGAKI Takahiro
Bruce Momjian <[EMAIL PROTECTED]> wrote: > > 566.973777 > > 327.158222 <- (1) write() > > 560.773868 <- (2) sleep > > 544.106645 <- (3) fsync() > > OK, so you are saying that performance dropped only during the write(), > and not during the fsync()? Interesting. Almost yes, but there is a smal

Re: [HACKERS] Load distributed checkpoint

2006-12-27 Thread Simon Riggs
On Wed, 2006-12-27 at 23:26 +0100, Martijn van Oosterhout wrote: > On Wed, Dec 27, 2006 at 09:24:06PM +, Simon Riggs wrote: > > On Fri, 2006-12-22 at 13:53 -0500, Bruce Momjian wrote: > > > > > I assume other kernels have similar I/O smoothing, so that data sent to > > > the kernel via write()

Re: [HACKERS] Load distributed checkpoint

2006-12-27 Thread Martijn van Oosterhout
On Wed, Dec 27, 2006 at 09:24:06PM +, Simon Riggs wrote: > On Fri, 2006-12-22 at 13:53 -0500, Bruce Momjian wrote: > > > I assume other kernels have similar I/O smoothing, so that data sent to > > the kernel via write() gets to disk within 30 seconds. > > > > I assume write() is not our che

Re: [HACKERS] Load distributed checkpoint

2006-12-27 Thread Simon Riggs
On Mon, 2006-12-18 at 14:47 +0900, Takayuki Tsunakawa wrote: > Hello, Itagaki-san, all > > Sorry for my long mail. I've had trouble in sending this mail because > it's too long for pgsql-hackers to accept (I couldn't find how large > mail is accepted.) So I'm trying to send several times. > Ple

Re: [HACKERS] Load distributed checkpoint

2006-12-27 Thread Simon Riggs
On Fri, 2006-12-22 at 13:53 -0500, Bruce Momjian wrote: > I assume other kernels have similar I/O smoothing, so that data sent to > the kernel via write() gets to disk within 30 seconds. > > I assume write() is not our checkpoint performance problem, but the > transfer to disk via fsync(). W

Re: [HACKERS] Load distributed checkpoint

2006-12-26 Thread Bruce Momjian
ITAGAKI Takahiro wrote: > > Bruce Momjian <[EMAIL PROTECTED]> wrote: > > > I assume write() is not our checkpoint performance problem, but the > > transfer to disk via fsync(). Perhaps a simple solution is to do the > > write()'s of all dirty buffers as we do now at checkpoint time, but > > dela

Re: [HACKERS] Load distributed checkpoint

2006-12-26 Thread ITAGAKI Takahiro
Bruce Momjian <[EMAIL PROTECTED]> wrote: > I assume write() is not our checkpoint performance problem, but the > transfer to disk via fsync(). Perhaps a simple solution is to do the > write()'s of all dirty buffers as we do now at checkpoint time, but > delay 30 seconds and then do fsync() on al
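
Bruce's proposal (issue all write()s at checkpoint time, wait about 30 seconds so the kernel can flush on its own schedule, then fsync()) can be sketched with plain POSIX-style calls. This is a user-space illustration, not PostgreSQL code: the function name, the `pages_by_path` mapping and the injectable `sleep` argument are all hypothetical.

```python
import os
import time


def _write_all(fd, data):
    """Loop until every byte is handed to the kernel (os.write may be partial)."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def write_then_delayed_fsync(pages_by_path, delay_s=30.0, sleep=time.sleep):
    """Write all dirty pages, pause, then fsync each touched file.

    pages_by_path maps a file path to a list of (offset, bytes) pairs.
    Returns the number of files that were fsync'ed.
    """
    fds = []
    try:
        # Phase 1: write() everything; the data only reaches the OS cache.
        for path, pages in pages_by_path.items():
            fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
            fds.append(fd)
            for offset, data in pages:
                os.lseek(fd, offset, os.SEEK_SET)
                _write_all(fd, data)
        # Phase 2: give the kernel time to write pages out in the background.
        sleep(delay_s)
        # Phase 3: fsync(); ideally little dirty data is left by now.
        for fd in fds:
            os.fsync(fd)
    finally:
        for fd in fds:
            os.close(fd)
    return len(fds)
```

As later replies in the thread point out, the pause only helps if no new writes to the same files arrive during the delay, because fsync() flushes everything dirty in the file and not just the checkpoint's pages.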

Re: [HACKERS] Load distributed checkpoint

2006-12-25 Thread Takayuki Tsunakawa
From: "Bruce Momjian" <[EMAIL PROTECTED]> > On an idle system, would someone dirty a large file, and watch the disk > I/O to see how long it takes for the I/O to complete to disk? I ran "dd if=/dev/zero of= bs=8k count=`expr 1048576 / 8`, that is, writing 1GB file with 8KB write()'s. It took abou
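
The arithmetic behind the dd invocation is easy to check: `expr 1048576 / 8` yields the block count, and dd's `bs=8k` means 8192-byte blocks. A minimal sketch (the function name is made up for illustration):

```python
def dd_total_bytes(block_size_bytes, block_count):
    """Total bytes dd writes for a given bs and count."""
    return block_size_bytes * block_count


BLOCK_SIZE_BYTES = 8 * 1024   # bs=8k
BLOCK_COUNT = 1048576 // 8    # count=`expr 1048576 / 8`, i.e. 131072 blocks
TOTAL_BYTES = dd_total_bytes(BLOCK_SIZE_BYTES, BLOCK_COUNT)
```

`TOTAL_BYTES` comes to 1073741824, that is 1 GiB written as 131072 write() calls of 8 KB each, matching the description in the message.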

Re: [HACKERS] Load distributed checkpoint

2006-12-25 Thread Takayuki Tsunakawa
Hello, Inaam-san, > There are four IO schedulers in Linux. Anticipatory, CFQ (default), deadline, and noop. For typical OLTP type loads generally deadline is recommended. If you are constrained on CPU and you have a good controller then it's better to use noop. > Deadline attempts to merge requests

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Bruce Momjian
Gregory Stark wrote: > > "Bruce Momjian" <[EMAIL PROTECTED]> writes: > > > I have a new idea. Rather than increasing write activity as we approach > > checkpoint, I think there is an easier solution. I am very familiar > > with the BSD kernel, and it seems they have a similar issue in trying to

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Greg Smith
On Fri, 22 Dec 2006, Simon Riggs wrote: I have also seen cases where the WAL drive, even when separated, appears to spike upwards during a checkpoint. My best current theory, so far untested, is that the WAL and data drives are using the same CFQ scheduler and that the scheduler actively slows d

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Gregory Stark
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > I have a new idea. Rather than increasing write activity as we approach > checkpoint, I think there is an easier solution. I am very familiar > with the BSD kernel, and it seems they have a similar issue in trying to > smooth writes: Just to give a

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Inaam Rana
On 12/22/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote: From: Inaam Rana > Which IO Scheduler (elevator) you are using? Elevator? Sorry, I'm not familiar with the kernel implementation, so I don't know what it is. My Linux distribution is Red Hat Enterprise Linux 4.0 for AMD64/EM64T, and the k

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Bruce Momjian
I have a new idea. Rather than increasing write activity as we approach checkpoint, I think there is an easier solution. I am very familiar with the BSD kernel, and it seems they have a similar issue in trying to smooth writes: http://www.brno.cas.cz/cgi-bin/bsdi-man?proto=1.1&

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Simon Riggs
On Thu, 2006-12-21 at 18:46 +0900, ITAGAKI Takahiro wrote: > "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > > > > If you use Linux, it has very unpleasant behavior in fsync(); It locks all > > > metadata of the file being fsync-ed. We have to wait for the completion of > > > fsync when we do rea

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Zeugswetter Andreas ADI SD
> > If you use linux, try the following settings: > > 1. Decrease /proc/sys/vm/dirty_ratio and dirty_background_ratio. You will need to pair this with bgwriter_* settings, else too few pages are written to the OS in between checkpoints. > > 2. Increase wal_buffers to reduce WAL flushing. You

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
l Message - From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> To: "Takayuki Tsunakawa" <[EMAIL PROTECTED]> Cc: Sent: Friday, December 22, 2006 6:09 PM Subject: Re: [HACKERS] Load distributed checkpoint "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrot

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
From: "Greg Smith" <[EMAIL PROTECTED]> > This is actually a question I'd been meaning to throw out myself to this > list. How hard would it be to add an internal counter to the buffer > management scheme that kept track of the current number of dirty pages? > I've been looking at the bufmgr code l

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > (1) Default case(this is show again for comparison and reminder) > 235 80 226 77 240 > (2) Default + WAL 1MB case > 302 328 82 330 85 > (3) Default + wal_sync_method=open_sync case > 162 67 176 67 164 > (4) (2)+(3) case > 322 350 85

Re: [HACKERS] Load distributed checkpoint

2006-12-22 Thread Takayuki Tsunakawa
From: Inaam Rana > Which IO Scheduler (elevator) you are using? Elevator? Sorry, I'm not familiar with the kernel implementation, so I don't know what it is. My Linux distribution is Red Hat Enterprise Linux 4.0 for AMD64/EM64T, and the kernel is 2.6.9-42.ELsmp. I probably haven't changed any kernel

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Inaam Rana
On 12/22/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote: From: "Takayuki Tsunakawa" <[EMAIL PROTECTED]> > (5) (4) + /proc/sys/vm/dirty* tuning > dirty_background_ratio is changed from 10 to 1, and dirty_ratio is > changed from 40 to 4. > > 308 349 84 349 84 Sorry, I forgot to include the

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Takayuki Tsunakawa
TECTED]> Cc: Sent: Friday, December 22, 2006 3:20 PM Subject: Re: [HACKERS] Load distributed checkpoint > Hello, Itagaki-san, > > Thank you for an interesting piece of information. > > From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> >> If you use linux, try the

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Greg Smith
On Wed, 20 Dec 2006, Inaam Rana wrote: Talking of bgwriter_* parameters I think we are missing a crucial internal counter i.e. number of dirty pages. How much work bgwriter has to do at each wakeup call should be a function of total buffers and currently dirty buffers. This is actually a que
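
Inaam's suggestion, that the bgwriter's work per wakeup should be a function of total buffers and currently dirty buffers, can be sketched like this. Nothing here is PostgreSQL code: the class, the proportional quota rule and the `max_pages_per_round` parameter are hypothetical, and a set is used instead of a bare integer so that marking an already dirty buffer again does not double count.

```python
import math


class DirtyPageCounter:
    """Tracks which buffers are dirty so the count is always available."""

    def __init__(self, total_buffers):
        if total_buffers <= 0:
            raise ValueError("total_buffers must be positive")
        self.total_buffers = total_buffers
        self._dirty = set()

    def mark_dirty(self, buf_id):
        self._dirty.add(buf_id)

    def mark_clean(self, buf_id):
        self._dirty.discard(buf_id)

    @property
    def dirty_count(self):
        return len(self._dirty)


def pages_for_this_round(counter, max_pages_per_round):
    """Scale the per-wakeup quota by the fraction of buffers that are dirty."""
    ratio = counter.dirty_count / counter.total_buffers
    quota = math.ceil(max_pages_per_round * ratio)
    # Never ask for more pages than are actually dirty.
    return min(counter.dirty_count, quota)
```

With 100 buffers of which 25 are dirty and a cap of 40 pages per round, the quota is 10 pages; with nothing dirty it drops to 0, so an idle system does no background writing.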

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Takayuki Tsunakawa
Hello, Itagaki-san, Thank you for an interesting piece of information. From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> > If you use linux, try the following settings: > 1. Decrease /proc/sys/vm/dirty_ratio and dirty_background_ratio. > 2. Increase wal_buffers to reduce WAL flushing. > 3. Set wal_

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > > For pg, half RAM for shared_buffers is too much. The ratio is good for > > other db software, that does not use the OS cache. > > What percentage of RAM is recommended for shared buffers in general? > 40%? 30%? Or, is the general recommendati

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Takayuki Tsunakawa
"Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> To: "Takayuki Tsunakawa" <[EMAIL PROTECTED]>; "ITAGAKI Takahiro" <[EMAIL PROTECTED]> Cc: Sent: Thursday, December 21, 2006 11:04 PM Subject: RE: [HACKERS] Load distributed checkpoint > > You were run

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Zeugswetter Andreas ADI SD
> > You were running the test on a very memory-dependent machine. > >> shared_buffers = 4GB / The scaling factor is 50, 800MB of data. > > That would be why the patch did not work. I tested it with DBT-2, 10GB of > > data and 2GB of memory. Storage is always the main part of performance here, > > ev

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Martijn van Oosterhout
On Thu, Dec 21, 2006 at 06:46:36PM +0900, ITAGAKI Takahiro wrote: > > Oh, really, what an evil fsync is! Yes, I sometimes saw a backend > > waiting for lseek() to complete when it committed. But why does the > > backend which is syncing WAL/pg_control have to wait for syncing the > > data file?

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Takayuki Tsunakawa
To: "Takayuki Tsunakawa" <[EMAIL PROTECTED]> Cc: Sent: Thursday, December 21, 2006 6:46 PM Subject: Re: [HACKERS] Load distributed checkpoint > From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> > "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote:

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > > If you use Linux, it has very unpleasant behavior in fsync(); It locks all > > metadata of the file being fsync-ed. We have to wait for the completion of > > fsync when we do read(), write(), and even lseek(). > > Oh, really, what an evil fsync

Re: [HACKERS] Load distributed checkpoint

2006-12-21 Thread Takayuki Tsunakawa
From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> > You were running the test on a very memory-dependent machine. >> shared_buffers = 4GB / The scaling factor is 50, 800MB of data. > That would be why the patch did not work. I tested it with DBT-2, 10GB of > data and 2GB of memory. Storage is always the

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > I have to report a sad result. Your patch didn't work. Let's > consider the solution together. What you are addressing is very > important for the system designers in the real world -- smoothing > response time. You were running the test on th

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa
On 12/20/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote: > > [Conclusion] > > I believe that the problem cannot be solved in a real sense by > > avoiding fsync/fdatasync(). We can't ignore what commercial databases > > have done so far. The kernel does as much as he likes when PostgreSQL > > re

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Inaam Rana
On 12/20/06, Takayuki Tsunakawa <[EMAIL PROTECTED]> wrote: [Conclusion] I believe that the problem cannot be solved in a real sense by avoiding fsync/fdatasync(). We can't ignore what commercial databases have done so far. The kernel does as much as he likes when PostgreSQL requests him to fsy

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Martijn van Oosterhout
On Wed, Dec 20, 2006 at 09:14:50PM +0900, Takayuki Tsunakawa wrote: > > That implies that fsyncing a datafile blocks fsyncing the WAL. That > > seems terribly unlikely (although...). What OS/Kernel/Filesystem is > > this. I note a sync bug in linux for ext3 that may have relevance. > > Oh, really?

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa
> That implies that fsyncing a datafile blocks fsyncing the WAL. That > seems terribly unlikely (although...). What OS/Kernel/Filesystem is > this. I note a sync bug in linux for ext3 that may have relevance. Oh, really? What bug? I've heard that ext3 reports wrong data to iostat when it perform

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Martijn van Oosterhout
On Wed, Dec 20, 2006 at 08:10:56PM +0900, Takayuki Tsunakawa wrote: > One question is the disk utilization. While bgwriter is fsync()ing, > %util of WAL disk drops to almost 0. But the bandwidth of > Ultra320 SCSI does not appear to be used fully. Any idea? That implies that fsyncing a data

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa
Hello, Itagaki-san, all I have to report a sad result. Your patch didn't work. Let's consider the solution together. What you are addressing is very important for the system designers in the real world -- smoothing response time. Recall that unpatched PostgreSQL showed the following tps's in c

Re: [HACKERS] Load distributed checkpoint

2006-12-19 Thread Takayuki Tsunakawa
Hello, Itagaki-san > I posted a patch to PATCHES. Please try out it. Really!? I've just joined pgsql-patches. When did you post it, yesterday? I couldn't find the patch in the following page which lists the mails to pgsql-patches of this month: http://archives.postgresql.org/pgsql-patches/200

Re: [HACKERS] Load distributed checkpoint

2006-12-19 Thread ITAGAKI Takahiro
"Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: > I performed some simple tests, and I'll show the results below. > (1) The default case > 235 80 226 77 240 > (2) No write case > 242 250 244 253 280 > (3) No checkpoint case > 229 252 256 292 276 > (4) No fsync() case > 236 112 215 2

Re: [HACKERS] Load distributed checkpoint

2006-12-13 Thread Jim C. Nasby
On Wed, Dec 13, 2006 at 06:27:38PM +0900, Takayuki Tsunakawa wrote: > No. BgBufferSync() correctly keeps track of the position to restart > scanning at. buf_id1 is not initialized to 0 every time BgBufferSync() > is called, because buf_id1 is a static local variable. Please see the > following code.

Re: [HACKERS] Load distributed checkpoint

2006-12-13 Thread Takayuki Tsunakawa
Hello, From: "Jim C. Nasby" <[EMAIL PROTECTED]> Also, I have a dumb question... BgBufferSync uses buf_id1 to keep track > of what buffer the bgwriter_all scan is looking at, which means that > it should remember where it was at the end of the last scan; yet it's > initialized to 0 every time BgBu

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Jim C. Nasby
On Fri, Dec 08, 2006 at 11:43:27AM -0500, Tom Lane wrote: > "Kevin Grittner" <[EMAIL PROTECTED]> writes: > > "Jim C. Nasby" <[EMAIL PROTECTED]> wrote: > >> Generally, I try and configure the all* settings so that you'll get 1 > >> clock-sweep per checkpoint_timeout. It's worked pretty well, but I

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes: > It's a fundamental shift in the idea of the purpose of bgwriter. Instead of > trying to suck i/o away from the subsequent checkpoint it would be responsible > for all the i/o of the previous checkpoint which would still be in progress > for the entire tim

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Gregory Stark
> Tom Lane wrote: >> >> I like Kevin's settings better than what Jim suggests. If the bgwriter >> only makes one sweep between checkpoints then it's hardly going to make >> any impact at all on the number of dirty buffers the checkpoint will >> have to write. The point of the bgwriter is to red

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Bruce Momjian
I have thought a while about this and I have some ideas. Ideally, we would be able to trickle the sync of individual blocks during the checkpoint, but we can't because we rely on the kernel to sync all dirty blocks that haven't made it to disk using fsync(). We could trickle the fsync() calls,

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Kevin Grittner
>>> On Tue, Dec 12, 2006 at 3:22 AM, in message <[EMAIL PROTECTED]>, "Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> wrote: >> > One thing I do worry about is if both postgresql and the OS >> > are both delaying write()s in the hopes of collapsing them >> > at the same time. If so, this would

Re: [HACKERS] Load distributed checkpoint

2006-12-12 Thread Zeugswetter Andreas ADI SD
> > One thing I do worry about is if both postgresql and the OS > > are both delaying write()s in the hopes of collapsing them > > at the same time. If so, this would cause both to experience > > bigger delays than expected, and make checkpoints worse. > > That is my concern. Letting 30 sec

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Kevin Grittner
>>> On Mon, Dec 11, 2006 at 3:31 PM, in message <[EMAIL PROTECTED]>, Ron Mayer <[EMAIL PROTECTED]> wrote: > > One thing I do worry about is if both postgresql and the OS > are both delaying write()s in the hopes of collapsing them > at the same time. If so, this would cause both to experienc

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Ron Mayer
ITAGAKI Takahiro wrote: > "Kevin Grittner" <[EMAIL PROTECTED]> wrote: > >> ...the file system cache will collapse repeated writes to the >> same location until things settle ... >> If we just push dirty pages out to the OS as soon as possible, >> and let the file system do its job, I think we're

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Inaam Rana
I wonder how the other big DBMS, IBM DB2, handles this. Is Itagaki-san referring to DB2? DB2 would also open data files with O_SYNC option and page_cleaners (counterparts of bgwriter) would exploit AIO if available on the system. Inaam Rana EnterpriseDB http://www.enterprisedb.com

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Takayuki Tsunakawa
; Cc: "ITAGAKI Takahiro" <[EMAIL PROTECTED]>; Sent: Monday, December 11, 2006 6:30 PM Subject: Re: [HACKERS] Load distributed checkpoint > On Fri, 2006-12-08 at 11:05 +0900, Takayuki Tsunakawa wrote: >> I understand that checkpoints occur during crash >> recovery

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Takayuki Tsunakawa
Hello, From: "ITAGAKI Takahiro" <[EMAIL PROTECTED]> "Takayuki Tsunakawa" <[EMAIL PROTECTED]> wrote: >> I'm afraid it is difficult for system designers to expect steady >> throughput/response time, as long as PostgreSQL depends on the >> flushing of file system cache. How does Oracle provide stable

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread ITAGAKI Takahiro
"Kevin Grittner" <[EMAIL PROTECTED]> wrote: > We have not experienced any increase in I/O, just a smoothing. Keep in > mind that the file system cache will collapse repeated writes to the > same location until things settle, and the controller's cache also has a > chance of doing so. If we just

Re: [HACKERS] Load distributed checkpoint

2006-12-11 Thread Simon Riggs
On Fri, 2006-12-08 at 11:05 +0900, Takayuki Tsunakawa wrote: > I understand that checkpoints occur during crash > recovery and PITR, so time for those operations would get longer. A restartpoint happens during recovery, not a checkpoint. The recovery is merely repeating the work of the checkpoint