30.05.2017 19:32, Leyne, Sean wrote:
Async/Overlapped IO allows for IO on any number of file blocks (aka pages)
without limit to their locations, consecutive or not.
Your words "single operation for any storage device" make me think that
you are referring to a single OS call. There is no such API on the "only platform
that matters" that allows posting many IO requests in one call (*nix has it,
btw). Note that a write of 10 consecutive pages in one call to WriteFileXXX is
still a single IO request from the application (engine) POV.
Windows does have the ability to post many IO requests in a single call -- see
Overlapped IO structure.
I already wrote that one big request to write many consecutive pages is still
one single IO request from app point of view. One OVERLAPPED structure can't
post more than one IO request.
Regarding "consecutive pages" do you mean that the pages fall one after the
other, or that 10 pages are written consecutively?
I don't know what "pages fall" means, so - yes - I am speaking about the
physical order of pages on disk.
I don't see how database integrity can be maintained if the header page
changes are not persisted to disk immediately -- aside from an MPI
based multi-node cluster where page changes are sent to other nodes
(as witnesses for safekeeping)*.
The idea is to defer the header page write until the write of any
other page, in the hope that other transactions could start in between.
But if the write is deferred to the start of the next transaction...
The Header page write is deferred to the write of any other page, not to the
start of the next transaction.
How would the database know what the last committed transaction was?
If a few transactions started and incremented the Next counter in memory only,
and there were no page writes in the meantime - what is the problem? No
transactions were committed (as there were no page writes). Nobody will ever
know that those transactions existed (except their users). There are no visible
effects in the database for other users.
What does page writes have to do with transaction commits?
A transaction on commit (rollback) writes all pages it marked as dirty to disk.
Commit/rollback returns to the user after the OS has confirmed that all such
pages are written to disk.
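
The commit rule just described can be sketched roughly as follows. This is only an illustration, not engine code: PAGE_SIZE, the commit() function and the dirty_pages mapping (page number -> page bytes) are names made up for the sketch, and os.fsync stands in for "the OS confirmed the write".

```python
import os

PAGE_SIZE = 4096  # assumed page size, for illustration only


def commit(fd, dirty_pages):
    """Write every page the transaction marked dirty, then wait for the OS.

    dirty_pages is a hypothetical dict: page number -> bytes of length PAGE_SIZE.
    """
    for page_no, data in sorted(dirty_pages.items()):
        os.lseek(fd, page_no * PAGE_SIZE, os.SEEK_SET)
        os.write(fd, data)
    # Control returns to the caller only after the OS reports the data as durable.
    os.fsync(fd)
    dirty_pages.clear()
```

Nothing is reported to the user before os.fsync returns, which is the point of the rule: a commit that has returned implies its pages are on disk.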
What about the data changes that those transactions could apply to database
pages, before the transaction is committed? They would be written to disk, no?
Yes.
If the engine dies before those transactions are committed, then when the engine
restarts, how would it clean up the incomplete changes?
In the TIP those transactions will still be marked as active. The engine will
detect their real state (dead) and undo their changes.
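
As a rough model of that restart logic (a simplification; the state names, the recover() function and the undo_changes callback are hypothetical and not taken from the engine sources):

```python
ACTIVE, COMMITTED, ROLLED_BACK = "active", "committed", "rolled_back"


def recover(tip, undo_changes):
    """After a restart nothing can still be running, so 'active' means dead.

    tip is a hypothetical dict: transaction number -> state.
    undo_changes is a callback that removes the changes of one transaction.
    """
    dead = [tx for tx, state in sorted(tip.items()) if state == ACTIVE]
    for tx in dead:
        undo_changes(tx)        # remove whatever the dead transaction wrote
        tip[tx] = ROLLED_BACK   # record its real state in the TIP
    return dead
```

For example, with tip = {1: COMMITTED, 2: ACTIVE}, recover() would undo only tx 2 and leave tx 1 untouched.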
Wouldn't this detail be required on server restart, if the server
abended/was killed right after the transaction write?
What detail? The last committed tx? It is fixed in the TIP, and that requires a
page write, which will force the Header page to disk before the TIP page is written.
But you are proposing to delay the Header page write, no?
Yes.
If the Header needs to be written before the TIP can be written and the TIP
provides the details about the last committed Tx, how would it be possible to
defer Header writes?
Currently:

tx1 starts
    fetch Header page
tx2 starts
    fetch Header page
    waiting...
tx1
    increment Next
    write Header page
    release Header page
tx2
    ...Header page fetched
    increment Next
    write Header page
    release Header page
tx1 commit
    write all dirty pages marked by tx1
    fetch TIP page
    mark tx1 as committed
    write TIP page
    release TIP page
tx2 commit
    write all dirty pages marked by tx2
    fetch TIP page
    mark tx2 as committed
    write TIP page
    release TIP page

Will be:

tx1 starts
    fetch Header page
tx2 starts
    fetch Header page
    waiting...
tx1
    increment Next
    release Header page
tx2
    ...Header page fetched
    increment Next
    release Header page
tx1 commit
    write all dirty pages marked by tx1
        before writing the first dirty page, write Header page
    fetch TIP page
    mark tx1 as committed
    write TIP page
    release TIP page
tx2 commit
    write all dirty pages marked by tx2
        -- no need to write Header page
    fetch TIP page
    mark tx2 as committed
    write TIP page
    release TIP page
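
To compare the two sequences, here is a toy simulation that only records the order of page writes. It is a sketch under my own assumptions: the Engine class, its methods and the page labels are invented for illustration, and locking (fetch/release) is left out.

```python
class Engine:
    """Toy model that logs page writes; not the real engine."""

    def __init__(self, defer_header):
        self.defer_header = defer_header
        self.next_tx = 0           # "Next" counter kept on the in-memory Header page
        self.header_dirty = False
        self.log = []              # ordered list of simulated page writes

    def _flush_header(self):
        if self.header_dirty:
            self.log.append("header")
            self.header_dirty = False

    def start(self):
        self.next_tx += 1
        self.header_dirty = True
        if not self.defer_header:
            self._flush_header()   # current behaviour: write Header at once
        return self.next_tx

    def _write(self, page):
        self._flush_header()       # Header must reach disk before any other page
        self.log.append(page)

    def commit(self, tx, dirty_pages):
        for page in dirty_pages:
            self._write(page)
        self._write(f"tip(tx{tx})")


def run(defer_header):
    engine = Engine(defer_header)
    tx1, tx2 = engine.start(), engine.start()
    engine.commit(tx1, ["data1"])
    engine.commit(tx2, ["data2"])
    return engine.log
```

In this model run(False) logs two Header writes (one per transaction start), while run(True) logs a single Header write, placed just before the first dirty page of tx1.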
Or by "Defer" do you mean really -- Header would only be written on transaction commits?
If there was no other page write - yes, the Header page will be written at commit.
(That start transaction would no longer cause a page write)
Yes
If so, then that would be a good thing (still concerned about data from
incomplete transactions).
Sure, it is a good thing ;) Still have concerns?
In this way, the main server would not need to wait for Write IOs.
It is not possible to completely remove the need to wait. One needs to wait for
completion of a page write before marking this page as dirty again. I.e. writers
must wait for each other.
I was making a distinction between waiting for the other nodes to acknowledge
the receipt of the page (which can be very fast) and the need to wait for the
page to be actually written to storage (slower).
Writers need only wait for the other nodes to ACK.
We shouldn't modify the memory buffer (page contents) while it is being handled
by an OS write request.
Even with copy-on-write we should wait for the current write request to
complete before issuing another one.
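
A minimal sketch of that rule, assuming a per-page flag guarded by a condition variable (the PageBuffer class and its method names are hypothetical, and a real engine would use its own latches rather than Python threading):

```python
import threading


class PageBuffer:
    """One page buffer that must not change while a write is in flight."""

    def __init__(self, size):
        self.data = bytearray(size)
        self._cond = threading.Condition()
        self._write_in_flight = False

    def begin_write(self):
        # Only one outstanding write request per page at a time.
        with self._cond:
            while self._write_in_flight:
                self._cond.wait()
            self._write_in_flight = True

    def end_write(self):
        # Called when the OS reports completion of the write request.
        with self._cond:
            self._write_in_flight = False
            self._cond.notify_all()

    def modify(self, offset, payload, timeout=None):
        # A writer waits until the buffer is no longer owned by an OS write.
        with self._cond:
            if not self._cond.wait_for(lambda: not self._write_in_flight, timeout):
                return False       # timed out; buffer left untouched
            self.data[offset:offset + len(payload)] = payload
            return True
```

Here modify() blocks (or times out) between begin_write() and end_write(), which models writers waiting for each other.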
Regards,
Vlad
Firebird-Devel mailing list, web interface at
https://lists.sourceforge.net/lists/listinfo/firebird-devel