Hi,

I think I have found a relatively cheap solution to
the consistency problem we encountered with having
multiple writable mappings to the same page.

The solution involves Ben's reverse PTE lookup (which
will have to support multiply mapped pages) and the
idea that we will have to limit the number of dirty
pages in memory anyhow.
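
Just to make the reverse lookup concrete, here is a rough
userspace toy of what I imagine it looking like: a per-page
chain of pointers back to the PTEs that map the page, so we
can find and write-protect every mapping (step 3 in the
list below). This is only a sketch of the idea, not Ben's
code, and all the names are made up.

/* Toy model of a per-page reverse mapping chain -- names made up. */
#include <stdio.h>
#include <stdlib.h>

#define PTE_WRITE 0x1UL

struct pte {                    /* stand-in for a hardware page table entry */
    unsigned long flags;
};

struct pte_chain {              /* one node per mapping of the page */
    struct pte *ptep;
    struct pte_chain *next;
};

struct page {
    int dirty;
    struct pte_chain *rmap;     /* head of the reverse-mapping chain */
};

/* Record that ptep maps this page (would be called from the fault path). */
static void rmap_add(struct page *page, struct pte *ptep)
{
    struct pte_chain *pc = malloc(sizeof(*pc));

    pc->ptep = ptep;
    pc->next = page->rmap;
    page->rmap = pc;
}

/* Walk every mapping of the page and clear its write bit. */
static void rmap_write_protect(struct page *page)
{
    struct pte_chain *pc;

    for (pc = page->rmap; pc; pc = pc->next)
        pc->ptep->flags &= ~PTE_WRITE;
}

int main(void)
{
    struct page page = { 0, NULL };
    struct pte a = { PTE_WRITE }, b = { PTE_WRITE }, c = { PTE_WRITE };

    rmap_add(&page, &a);        /* the page is mapped three times */
    rmap_add(&page, &b);
    rmap_add(&page, &c);
    rmap_write_protect(&page);  /* all three mappings become read-only */
    printf("write bits: %lu %lu %lu\n",
           a.flags & PTE_WRITE, b.flags & PTE_WRITE, c.flags & PTE_WRITE);
    return 0;
}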

Say we start out with a clean page that is mapped by
3 processes; one of them dirties the page under us, and
we encounter it on a page scan.

1. we mark the page dirty
2. we queue the page for writing in a buffer head
   and mark the page and buffer as locked
3. we make the mapping read-only for all processes;
   due to the lock they will block until step 4 is
   done
4. we unlock the buffer, making it possible to write
   to disk
5. if a process dirties the page before we get a chance
   to flush it to disk, we clone the old page and point
   the buffer head to the old version
6. if we write out the page before one of the processes
   dirties it, we remove the special status on the page
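
To make the ordering above concrete, here is a toy
userspace model of the six steps (no real locking, no real
kernel structures, names invented), mostly to show how the
buffer head gets repointed at the old copy in step 5:

/* Toy model of steps 1-6 -- no real locking, names made up. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 16            /* tiny pages keep the demo readable */

struct page {
    char data[PAGE_SIZE];
    int dirty, locked;
};

struct buffer_head {            /* what the block layer will write out */
    struct page *page;
    int locked;
};

/* Steps 1-4: mark dirty, queue in a buffer head, write-protect the
 * mappings (the rmap walk from the earlier sketch would go here),
 * then unlock the buffer so the I/O can go out. */
static void queue_for_writeback(struct page *page, struct buffer_head *bh)
{
    page->dirty = 1;                    /* step 1 */
    bh->page = page;                    /* step 2 */
    page->locked = bh->locked = 1;
    /* step 3: rmap_write_protect(page) */
    bh->locked = 0;                     /* step 4 */
}

/* Step 5: a process re-dirties the page before the flush finished. */
static void write_fault(struct page *page, struct buffer_head *bh)
{
    if (bh->page == page && page->locked) {
        struct page *old = malloc(sizeof(*old));

        *old = *page;                   /* clone the still-clean contents */
        bh->page = old;                 /* disk gets the old, consistent copy */
        page->locked = 0;               /* new writes hit the live page */
    }
}

/* Step 6: the flush happened; drop the special status (or the clone). */
static void writeback_done(struct page *page, struct buffer_head *bh)
{
    if (bh->page == page) {
        page->dirty = 0;
        page->locked = 0;
    } else {
        free(bh->page);                 /* the cloned copy is no longer needed */
    }
    bh->page = NULL;
}

int main(void)
{
    struct page p = { "hello", 0, 0 };
    struct buffer_head bh = { NULL, 0 };

    queue_for_writeback(&p, &bh);
    write_fault(&p, &bh);               /* a write comes in before the flush */
    strcpy(p.data, "world");            /* the live page changes... */
    printf("disk sees \"%s\"\n", bh.page->data);  /* ...the old copy goes out */
    writeback_done(&p, &bh);
    return 0;
}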

I know this could involve an extra copy of the page for
writing, but it fits in really well with the transactioning
scheme and could, with the proper API, give us a way to
implement a high-performance transactioning interface for
userspace. Keeping the transactioning and the syncing 
completely separated will give userspace a choice between
consistency with performance and consistency with minimal
data loss. The database folks will probably love this :)
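
Nothing below is a real or proposed API, it is just a toy
to show what keeping commit and sync separate buys us:
"commit" only snapshots the data in memory (consistency),
while a separate "sync" pushes the last committed snapshot
to disk (durability). All the names are invented:

/* Toy illustration only -- none of these names exist anywhere. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct txn_region {
    char *live;                 /* what the application writes into */
    char *committed;            /* last consistent snapshot */
    size_t len;
};

/* "Commit": cheap, in-memory, gives consistency but no durability. */
static void txn_commit(struct txn_region *r)
{
    memcpy(r->committed, r->live, r->len);
}

/* "Sync": pushes the last committed snapshot to disk, never the
 * half-finished live data. */
static void txn_sync(struct txn_region *r, const char *path)
{
    FILE *f = fopen(path, "wb");

    if (!f)
        return;
    fwrite(r->committed, 1, r->len, f);
    fclose(f);
}

int main(void)
{
    struct txn_region r = { calloc(1, 32), calloc(1, 32), 32 };

    strcpy(r.live, "balance=100");
    txn_commit(&r);                  /* consistent state */
    strcpy(r.live, "balance=1");     /* in-flight, uncommitted update */
    txn_sync(&r, "state.db");        /* disk gets "balance=100" */
    free(r.live);
    free(r.committed);
    return 0;
}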

regards,

Rik
--
The Internet is not a network of computers. It is a network
of people. That is its real strength.
