Greg Stark wrote:
Jan Wieck [EMAIL PROTECTED] writes:
The whole sync() vs. fsync() discussion is in my opinion nonsense at this point. Without the ability to limit the amount of files to a reasonable number, by employing tablespaces in the form of larger container files, the risk of forcing excessive head
Jan Wieck wrote:
Tom Lane wrote:
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
So Imho the target should be to have not much IO open for the checkpoint, so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the bgwriter and
Kevin Brown [EMAIL PROTECTED] writes:
The bigger problem though with this is that it makes the
problem of list overflow much worse. The hard part about
shared memory management is not so much that the available
space is small, as that the available space is fixed
Tom Lane wrote:
The best idea I've heard so far is the one about sync() followed by
a bunch of fsync()s. That seems to be correct, efficient, and dependent
only on very-long-established Unix semantics.
Agreed.
--
Bruce Momjian | http://candle.pha.pa.us
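The sync()-then-fsync() sequence being agreed on here can be sketched as follows (an illustrative Python stand-in for the C calls, not PostgreSQL source; the file list is hypothetical):

```python
import os

def checkpoint_flush(paths):
    """Kick off writeback of all dirty kernel buffers with sync(),
    then make each checkpoint file individually durable with fsync().
    sync() merely schedules I/O; the per-file fsync() calls do not
    return until that file's data has reached disk."""
    os.sync()  # hint the kernel to start writing everything it has dirty
    for path in paths:
        fd = os.open(path, os.O_RDWR)
        try:
            os.fsync(fd)  # blocks until this file is durable
        finally:
            os.close(fd)
```

The point of the initial sync() is overlap: the kernel can be writing all files in parallel while the serial fsync() loop merely waits for each one to finish.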
Tom Lane wrote:
You can only fsync one FD at a time (too bad ... if there were a
multi-file-fsync API it'd solve the overspecified-write-ordering issue).
What about aio_fsync()?
Florian Weimer [EMAIL PROTECTED] writes:
Tom Lane wrote:
You can only fsync one FD at a time (too bad ... if there were a
multi-file-fsync API it'd solve the overspecified-write-ordering issue).
What about aio_fsync()?
(1) it's unportable; (2) it's not clear that it's any improvement over
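aio_fsync() is a POSIX C interface with no Python binding, so the asynchronous-fsync idea can only be approximated here; a background thread that issues the fsync and hands back a waitable handle conveys the same shape (a sketch, not how aio_fsync is implemented):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Python exposes no aio_fsync(); this thread-pool stand-in only mimics
# the idea: the fsync is issued in the background and the caller gets a
# handle to wait on later, instead of blocking in fsync() right away.
_pool = ThreadPoolExecutor(max_workers=4)

def fsync_async(fd):
    """Return a Future that completes once fd's data is durable."""
    return _pool.submit(os.fsync, fd)
```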
Bruce Momjian wrote:
Jan Wieck wrote:
Tom Lane wrote:
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
So Imho the target should be to have not much IO open for the checkpoint,
so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the
Jan Wieck [EMAIL PROTECTED] writes:
The whole sync() vs. fsync() discussion is in my opinion nonsense at
this point.
The sync vs fsync discussion is not about performance, it is about
correctness. You can't simply dismiss the fact that we don't know
whether a checkpoint is really complete
Jan Wieck [EMAIL PROTECTED] writes:
Doing this is not just what you call it. In a system with let's say 500
active backends on a database with let's say 1000 things that are
represented as a file, you'll need half a million virtual file descriptors.
[shrug] We've been dealing with virtual
I wrote:
But that someplace else
could easily be a process forked by the backend in question whose sole
purpose is to go through the list of files generated by its parent backend
and fsync() them. The backend can then go about its business and upon
receipt of the SIGCHLD notify anyone that
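The fork-a-helper idea above can be sketched directly (illustrative only; error handling and the SIGCHLD notification are simplified, and the file list is hypothetical):

```python
import os

def fsync_in_child(paths):
    """Fork a child whose sole purpose is to fsync the parent's list of
    files; the parent goes about its business and reaps the child (e.g.
    on SIGCHLD) when it wants confirmation the checkpoint data is down."""
    pid = os.fork()
    if pid == 0:  # child: fsync everything, report via exit status
        status = 0
        for path in paths:
            try:
                fd = os.open(path, os.O_RDWR)
                os.fsync(fd)
                os.close(fd)
            except OSError:
                status = 1
        os._exit(status)
    return pid  # parent: waitpid(pid, 0) later to learn the outcome
```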
Kevin Brown wrote:
I have no idea whether or not this approach would work in Windows.
The win32 API has ReadFileScatter/WriteFileGather, which were developed
to handle these types of problems. These two functions were added for
the sole purpose of making SQL server run faster. They are always
Tom Lane wrote:
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
So Imho the target should be to have not much IO open for the checkpoint,
so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the bgwriter
and hope that the kernel will see
Tom Lane wrote:
Kevin Brown [EMAIL PROTECTED] writes:
Well, running out of space in the list isn't that much of a problem. If
the backends run out of list space (and the max size of the list could
be a configurable thing, either as a percentage of shared memory or as
an absolute size),
I don't think the bgwriter is going to be able to keep up with I/O bound
backends, but I do think it can scan and set those booleans fast enough
for the backends to then perform the writes.
As long as the bgwriter does not do sync writes (which it does not,
since that would need a whole lot
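The division of labor being debated here — the bgwriter only scans and flags, while backends pay the I/O cost — can be modeled with a toy sketch (plain Python objects standing in for buffer headers; all names are hypothetical):

```python
# Toy model: the bgwriter pass is cheap (it only sets booleans), and
# the actual write() calls happen later in the backends.
class Buffer:
    def __init__(self, data):
        self.data = data
        self.dirty = False          # set when a backend modifies the page
        self.write_pending = False  # set by the bgwriter scan

def bgwriter_scan(buffers):
    for buf in buffers:
        if buf.dirty:
            buf.write_pending = True  # flag only; no I/O in this loop

def backend_flush(buffers, write_fn):
    for buf in buffers:
        if buf.write_pending:
            write_fn(buf.data)  # the backend performs the actual write
            buf.dirty = buf.write_pending = False
```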
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
So Imho the target should be to have not much IO open for the checkpoint,
so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the bgwriter
and hope that the kernel will see fit to write
On Thursday 05 February 2004 20:24, Tom Lane wrote:
Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:
So Imho the target should be to have not much IO open for the checkpoint,
so the fsync is fast enough, even if serial.
The best we can do is push out dirty pages with write() via the
Shridhar Daithankar [EMAIL PROTECTED] writes:
There are other benefits of writing pages earlier even though they might not
get synced immediately.
Such as?
It would tell kernel that this is latest copy of updated buffer. Kernel VFS
should make that copy visible to every other backend as
People keep saying that the bgwriter mustn't write pages synchronously
because it'd be bad for performance, but I think that analysis is
faulty. Performance of what --- the bgwriter? Nonsense, the *point*
Imho that depends on the workload. For a normal OLTP workload this is
certainly
I am concerned that the bgwriter will not be able to keep up with the
I/O generated by even a single backend restoring a database, let alone a
busy system. To me, the write() performed by the bgwriter, because it
is I/O, will typically be the bottleneck on any system that is I/O bound
Tom Lane wrote:
Kevin Brown [EMAIL PROTECTED] writes:
Instead, have each backend maintain its own separate list in shared
memory. The only readers of a given list would be the backend it belongs
to and the bgwriter, and the only time bgwriter attempts to read the
list is at checkpoint
Kevin Brown [EMAIL PROTECTED] writes:
Tom Lane wrote:
The more finely you slice your workspace, the more likely it becomes
that one particular part will run out of space. So the inefficient case
where a backend isn't able to insert something into the appropriate list
will become considerably
Bruce Momjian wrote:
Here is my new idea. (I will keep throwing out ideas until I hit on a
good one.) The bgwriter is going to have to check before every write to
determine if the file is already recorded as needing fsync during
checkpoint. My idea is to have that checking happen during the
Kevin Brown [EMAIL PROTECTED] writes:
Instead, have each backend maintain its own separate list in shared
memory. The only readers of a given list would be the backend it belongs
to and the bgwriter, and the only time bgwriter attempts to read the
list is at checkpoint time.
The sum total
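The per-backend fixed-size list, and the overflow problem raised against it, can be modeled in a toy class (plain Python standing in for shared memory; the capacity and the overflow flag are hypothetical details of one possible design):

```python
# Each backend records the segments it has dirtied in a fixed-capacity
# list; because shared memory is fixed-size, overflow must degrade
# gracefully: set a flag telling the checkpointer to fsync everything
# for this backend instead of just the listed files.
class BackendFsyncList:
    def __init__(self, capacity=8):
        self.capacity = capacity
        self.entries = set()
        self.overflowed = False

    def record_write(self, relfile):
        if self.overflowed or relfile in self.entries:
            return  # already covered
        if len(self.entries) >= self.capacity:
            self.overflowed = True  # fall back to fsync-everything
            self.entries.clear()
        else:
            self.entries.add(relfile)
```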
Tom Lane wrote:
What I've suggested before is that the bgwriter process can keep track
of all files that it's written to since the last checkpoint, and fsync
them during checkpoint (this would likely require giving the checkpoint
task to the bgwriter instead of launching a separate process for
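The suggestion above — remember every file written since the last checkpoint and fsync just those at checkpoint time — can be sketched as (illustrative, Unix-only via os.pwrite, not the PostgreSQL implementation):

```python
import os

class BgWriter:
    """Sketch: track every fd written since the last checkpoint and
    fsync exactly that set when the checkpoint runs."""
    def __init__(self):
        self.pending = set()

    def write_page(self, fd, offset, page):
        os.pwrite(fd, page, offset)  # dirty the kernel buffer
        self.pending.add(fd)         # remember it needs fsync

    def checkpoint(self):
        for fd in self.pending:
            os.fsync(fd)             # make only the touched files durable
        self.pending.clear()
```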
Bruce Momjian [EMAIL PROTECTED] writes:
The trick is to somehow record all files modified since the last
checkpoint, and open/fsync/close each one. My idea is to stat() each
file in each directory and compare the modify time to determine if the
file has been modified since the last
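Bruce's stat()-based idea can be sketched as a directory scan comparing each file's mtime against the last checkpoint time (an illustrative sketch; the function name and the flat-directory assumption are mine):

```python
import os

def files_modified_since(dirpath, last_checkpoint_time):
    """Return files in dirpath whose mtime is at or after the last
    checkpoint -- candidates for open/fsync/close at this checkpoint."""
    modified = []
    for name in os.listdir(dirpath):
        full = os.path.join(dirpath, name)
        if os.path.isfile(full) and os.stat(full).st_mtime >= last_checkpoint_time:
            modified.append(full)
    return modified
```

One caveat worth noting: mtime granularity on some filesystems is a full second, so a scan like this can miss or double-count writes that straddle the checkpoint instant.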
Tom Lane wrote:
Bruce Momjian [EMAIL PROTECTED] writes:
The trick is to somehow record all files modified since the last
checkpoint, and open/fsync/close each one. My idea is to stat() each
file in each directory and compare the modify time to determine if the
file has been modified
Bruce Momjian [EMAIL PROTECTED] writes:
Any ideas on how to record the
modified files without generating tons of output or locking contention?
What I've suggested before is that the bgwriter process can keep track
of all files that it's written to since the last checkpoint, and fsync
them