On Mon, 2007-03-12 at 17:46 -0700, Jeff Davis wrote:
> On Mon, 2007-03-12 at 13:21 +0000, Simon Riggs wrote:
> > So based on those thoughts, sync_scan_offset should be fixed at 16,
> > rather than being variable. In addition, ss_report_loc() should only
> > report its position every 16 blocks, rather than do this every time,
> > which will reduce overhead of this call.
>
> If we fix sync_scan_offset at 16, we might as well just get rid of it.
> Sync scans are only useful on large tables, and getting a free 16 pages
> over a scan isn't worth the trouble. However, even without
> sync_scan_offset,

Not sure what you mean by "a free 16 pages". Please explain?

> sync scans are still a valuable feature.

I have always thought synch scans are a valuable feature too.

> I agree that ss_report_loc() doesn't need to report on every call. If
> there's any significant overhead I agree that it should report less
> often. Do you think that the overhead is significant on such a simple
> function?

Let's try without it and see. There's no need to access shared memory so
often.

> I like the idea of reducing tuning parameters, but we should, at a
> minimum, still allow an on/off button for sync scans. My tests revealed
> that the wrong combination of OS/FS/IO-Scheduler/Controller could result
> in bad I/O behavior.

Agreed.

> > If need be, the value of scan_recycle_buffers can be varied upwards
> > should the scans drift apart, as a way of bringing them back together.
>
> If the scans aren't being brought together, that means that one of the
> scans is CPU bound or outside the combined cache trail (shared_buffers
> + OS buffer cache).
>
> > We aren't tracking whether they are together or apart, so I would like
> > to see some debug output from synch scans to allow us to assess how far
> > behind the second scan is as it progresses. e.g.
> > LOG: synch scan currently on block N, trailing pathfinder by M blocks
> > issued every 128 blocks as we go through the scans.
> >
> > Thoughts?
> >
>
> It's hard to track where all the scans are currently. One of the
> advantages of my patch is its simplicity: the scans don't need to know
> about other specific scans, and there is no concept in the code of a
> "head" scan or a "pack".

I'd still like to be able to trace each scan to see how far ahead/behind
it is from the other scans on the same table, however we do that.

Any backend can read the position of other backends' scans, so it should
be easy enough to put in a regular LOG entry that shows how far
ahead/behind they are from other scans.
We can trace just one backend and have it report on where it is with
respect to other backends, or you could have them all calculate their
position and have just the lead scan report the position of all other
scans.

> There is no easy way to tell which scan is ahead and which is behind.
> There was a discussion when I submitted this proposal at the beginning
> of 8.3, but I didn't see enough benefit to justify all of the costs and
> risks associated with scans communicating between eachother. I
> certainly can't implement that kind of thing before feature freeze, and
> I think there's a risk of lock contention for the communication
> required. I'm also concerned that -- if the scans are too
> interdependent -- it would make postgres less robust against the
> disappearance of a single backend (i.e. what if the backend that is
> leading a scan dies?).

I've not mentioned that again.

I'd like to see the trace option to allow us to tell whether it's working
as well as we'd like it to, both pre-release and in production. Also I
want to see whether various settings of scan_recycle_buffers help/hinder
the effectiveness of synch scans, as others have worried they might.

-- 
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
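
For illustration only, the two ideas discussed in this thread (having ss_report_loc() publish its position only every 16 blocks, and emitting a trace line every 128 blocks that says how far one scan trails another) might be sketched in C roughly as below. This is not code from the patch. The names SS_REPORT_INTERVAL, SS_TRACE_INTERVAL, ss_on_interval() and ss_blocks_behind() are hypothetical, and the sketch assumes block positions are plain unsigned block numbers on a table of known size, with scans wrapping around at the end of the table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants; the values 16 and 128 are the ones proposed in the thread. */
#define SS_REPORT_INTERVAL 16    /* publish scan position every 16 blocks */
#define SS_TRACE_INTERVAL  128   /* emit a trace LOG line every 128 blocks */

/*
 * Returns true when a scan that has read blocks_scanned blocks so far sits
 * on a multiple of interval. A caller would test this before touching
 * shared memory (reporting) or before writing a trace line.
 */
static bool
ss_on_interval(uint32_t blocks_scanned, uint32_t interval)
{
    assert(interval > 0);
    return blocks_scanned % interval == 0;
}

/*
 * Forward distance, in blocks, from position mine to position other on a
 * table of nblocks blocks, assuming the scan wraps from the last block back
 * to block 0. The 64-bit intermediate avoids overflow for very large tables.
 */
static uint32_t
ss_blocks_behind(uint32_t mine, uint32_t other, uint32_t nblocks)
{
    assert(nblocks > 0 && mine < nblocks && other < nblocks);
    return (uint32_t) (((uint64_t) other + nblocks - mine) % nblocks);
}
```

A scan loop could call ss_on_interval(n, SS_REPORT_INTERVAL) to decide when to report, and ss_on_interval(n, SS_TRACE_INTERVAL) to decide when to log the value of ss_blocks_behind(). Note that the forward distance alone cannot say which scan is really "ahead": on a 1000-block table, a scan at block 140 compared against one at block 100 yields 960, not -40. That is one concrete form of the difficulty Jeff describes above, so any trace output built this way would need to be read with that in mind.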