> > Async/Overlapped IO allows for IO on any number of file blocks (aka pages)
> > without limit to their locations, consecutive or not.
>
> Your words "single operation for any storage device" make me think that
> you are referring to a single OS call. There is no such API in "only platform
> that matters" which allows to post many IO requests at one call (*nix have it,
> btw). Note, a write of 10 consecutive pages at one call to the WriteFileXXX is
> still a single IO request from the application (engine) POV.

Windows does have the ability to post many IO requests in a single call -- see
the Overlapped IO (OVERLAPPED) structure.

Regarding "consecutive pages": do you mean that the pages fall one after the
other in the file, or that 10 pages are written one after another in time?

> >>> I don't see how database integrity can be maintained if the header page
> >>> changes are not persisted to disk immediately -- aside from an MPI
> >>> based multi-node cluster where page changes are sent to other nodes
> >>> (as witnesses for safekeeping)*.
> >>
> >> The idea is to defer header page write up to the write of any
> >> other page in a hope that other transactions could start in between.
> >
> > But if the write is deferred to the start of the next transaction...
>
> Header page write is deferred to the write of any other page, not to the
> start of the next transaction
>
> > How would the database know what the last committed transaction was?
>
> If few transactions started and increment Next counter in memory only,
> and there was no page writes in the mean time - what the problem ? No
> transactions were committed (as there was no page writes). Nobody ever
> know that transactions exists (except of its users). There is no visible
> effects in the database for the other users.

What do page writes have to do with transaction commits?

What about the data changes that those transactions could apply to database
pages before the transaction is committed? They would be written to disk, no?

If the engine dies before those transactions are committed, how would it clean
up the incomplete changes when it restarts?

> > Wouldn't this detail be required on server restart, if the server
> > abended/was killed right after the transaction write?
>
> What detail ? Last committed tx ? It is fixed in TIP and requires page
> write which will force Header page to disk before TIP page will be written.

But you are proposing to delay the Header page write, no?
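
To check that we are talking about the same thing, here is a toy model of the deferral as I read it. This is only a sketch in Python: every name in it (ToyPageCache, start_transaction, write_page, disk_log) is made up for illustration and none of it is Firebird code. It assumes that starting a transaction only bumps the Next counter in memory, and that a pending Header write is forced out just before any other page is written.

```python
class ToyPageCache:
    """Toy model only; not Firebird's actual cache or page layout."""

    def __init__(self):
        self.next_tx = 0           # in-memory Next transaction counter
        self.header_dirty = False  # True while the Header write is deferred
        self.disk_next_tx = 0      # Next value last persisted in the Header
        self.disk_log = []         # ordered record of simulated page writes

    def start_transaction(self):
        # Deferred variant: no page write here, only an in-memory increment.
        self.next_tx += 1
        self.header_dirty = True
        return self.next_tx

    def write_page(self, page_name):
        # Writing any other page (TIP, data, ...) first forces the pending
        # Header write, so the Header never lags behind pages already on disk.
        if self.header_dirty:
            self.disk_next_tx = self.next_tx
            self.disk_log.append("header")
            self.header_dirty = False
        self.disk_log.append(page_name)


if __name__ == "__main__":
    cache = ToyPageCache()
    for _ in range(3):
        cache.start_transaction()  # three starts, no simulated IO so far
    cache.write_page("tip")        # first real page write flushes the Header
    print(cache.disk_log, cache.disk_next_tx)
```

In this model three transaction starts cost one Header write instead of three, and a crash before the first page write loses only counter values that nothing on disk refers to. It says nothing about cleaning up data pages written by transactions that never committed, which is the part I am still asking about.
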
If the Header needs to be written before the TIP can be written, and the TIP
provides the details about the last committed Tx, how would it be possible to
defer Header writes?

Or by "defer" do you really mean that the Header would only be written on
transaction commits (i.e. that starting a transaction would no longer cause a
page write)? If so, that would be a good thing (though I am still concerned
about data from incomplete transactions).

> > In this way, the main server would not need to wait for Write IOs.
>
> It is not possible to completely remove needs to wait. One need to wait for
> completion of page write before mark this page as dirty again. I.e. writers
> must wait for each other.

I was making a distinction between waiting for the other nodes to acknowledge
the receipt of the page (which can be very fast) and the need to wait for the
page to be actually written to storage (slower).

Writers need only wait for the other nodes to ACK.


Sean

Firebird-Devel mailing list, web interface at https://lists.sourceforge.net/lists/listinfo/firebird-devel