Re: Revprop caching 'n stuff

Stefan Fuhrmann Mon, 12 Mar 2012 02:45:32 -0700

On 09.03.2012 17:12, Philip Martin wrote:

Stefan Fuhrmann <[email protected]> writes:

* new definition: stable generations are even-numbered
   (an odd value marks a write that is in progress or was aborted)
* writer stores timeout value (e.g. now() + 10s)
* writer increments generation -> odd number
   (if the result is an even number, there are concurrent
    writes or one process crashed; increment until we
    reach an odd generation)
* writer replaces revprop file
* writer increments the generation until it is even again
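
The writer steps above could be sketched roughly as follows. This is only an illustration, not the code on the branch: `SharedState`, `write_revprops`, `replace_file` and `grace` are made-up names, a plain Python object stands in for the shared-memory segment, and the caller is assumed to hold the existing revprop write lock. In a real implementation the increments would have to be atomic.

```python
import time


class SharedState:
    """Hypothetical stand-in for the shared-memory segment."""

    def __init__(self):
        self.generation = 0  # even = stable, odd = write in progress
        self.timeout = 0.0   # deadline stored by the current writer


def write_revprops(shm, replace_file, now=time.time, grace=10.0):
    """Run one revprop write following the even/odd generation scheme.

    Assumes the caller already holds the revprop write lock.
    """
    # Store the deadline first, so readers that see an odd
    # generation know how long to keep trusting (gen - 1).
    shm.timeout = now() + grace

    # Move to an odd generation. If a single increment lands on an
    # even number, an earlier writer left an odd value behind
    # (crash or concurrent write), so keep going until it is odd.
    shm.generation += 1
    while shm.generation % 2 == 0:
        shm.generation += 1

    # Move the new revprop file into place.
    replace_file()

    # Back to an even (stable) generation.
    while shm.generation % 2 == 1:
        shm.generation += 1
    return shm.generation
```

Starting from generation 0, a clean write ends at generation 2. Starting from a leftover odd value such as 3, the writer passes through 5 while replacing the file and ends at 6.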

* reader gets current generation from shm
* if even -> proceed, a write may or may not be in progress
* if odd -> a writer *might* have been stalled / aborted
   * if timeout > now() -> proceed with (gen-1) for lookups,
     the writer may still run
   * timeout expired -> increase the generation until it is even
     (causes everybody to re-read revprops; if the writer is still
     alive, it will increase the value further)
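
A minimal sketch of the reader side, under the same caveats: `SharedState` and `generation_for_lookup` are hypothetical names, the shared memory is modelled as a plain object, and real code would need atomic reads and increments.

```python
from dataclasses import dataclass


@dataclass
class SharedState:
    """Hypothetical stand-in for the shared-memory segment."""
    generation: int = 0   # even = stable, odd = write in progress
    timeout: float = 0.0  # deadline stored by the last writer


def generation_for_lookup(shm, now):
    """Return the generation a reader should use as its cache key."""
    gen = shm.generation
    if gen % 2 == 0:
        # Stable generation: use it as is.
        return gen
    if shm.timeout > now:
        # A writer may still be running; keep using the previous
        # stable generation for lookups.
        return gen - 1
    # Timeout expired: assume the writer stalled or died and bump
    # the generation until it is even, so that every process
    # re-reads its revprops.
    while shm.generation % 2 == 1:
        shm.generation += 1
    return shm.generation
```

With generation 5 and a timeout of 110, a reader at time 100 keeps using generation 4, while a reader at time 120 bumps the shared value to 6 and uses that.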

So, in case of an aborted writer process, the other
processes behave like proxies that see outdated data for
a short period of time only.

The critical parameter here is the timeout. It must be large
enough that no "move-into-place" operation could be stalled
longer than that (Q: how is that guaranteed to be atomic /
self-healing in the first place?). Otherwise, a crash between
move-into-place and bumping the generation number
would still cause an undetected change.
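
To make the failure mode concrete, here is a toy simulation of one reading of that scenario: the writer stalls before move-into-place and then crashes before the final generation bump. All names (`Repo`, `Reader`, `stalled_writer_scenario`) are invented for the illustration, and time is passed in explicitly instead of being read from a clock.

```python
class Repo:
    """Toy model: shared generation, writer timeout, one revprop file."""

    def __init__(self):
        self.generation = 0
        self.timeout = 0.0
        self.revprop_file = "old"


class Reader:
    """Toy reader with a cache keyed by generation."""

    def __init__(self):
        self.cache = {}

    def read(self, repo, now):
        gen = repo.generation
        if gen % 2 == 1:
            if repo.timeout > now:
                gen -= 1  # writer may still run: use previous gen
            else:
                while repo.generation % 2 == 1:
                    repo.generation += 1  # recover from "dead" writer
                gen = repo.generation
        if gen not in self.cache:
            self.cache[gen] = repo.revprop_file  # simulated disk read
        return self.cache[gen]


def stalled_writer_scenario(stall, grace=10.0):
    """Return what a reader sees long after a writer crashed.

    The writer starts at t=0, stalls for `stall` seconds before
    move-into-place, then crashes before the final generation bump.
    """
    repo, reader = Repo(), Reader()
    repo.timeout = 0.0 + grace   # writer stores its deadline
    repo.generation += 1         # writer goes to an odd generation
    reader.read(repo, now=stall)  # a read just before the move
    repo.revprop_file = "new"    # move-into-place, then crash
    return reader.read(repo, now=stall + 100.0)
```

In this model, `stalled_writer_scenario(5.0)` returns `"new"`: the stall is shorter than the timeout, so the later read bumps the generation and re-reads the file. `stalled_writer_scenario(15.0)` returns `"old"`: a reader already bumped the generation during the stall and cached the old contents under the new even generation, so the later change goes unnoticed.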

I'm not sure exactly what you are trying to implement.

Concurrent writes?  Are you planning to remove the existing revprops
write lock?  That would require a repository format bump.  I think it is
also incompatible with having multiple machines write to the same
repository.


No. I won't change the locking / serialization behavior.

I'm simply not 100% sure (haven't read that part of the
source code) how the next writer will detect an abandoned
lock. Assuming it's simply keeping a file open exclusively,
this is trivial. Otherwise, there might be some logic
in place that could also trigger the revprop gen bump.

I think you plan to address the problem of how/when to detect that the
read cache is out-of-date after a write by having the readers check the
shared memory on every read.  We could achieve the same by checking the
generation file itself, at the cost of disk IO.


At the cost of one disk I/O for every disk I/O we might
save on revprops, i.e. no total savings at all (unless
we are talking about huge revprop files).

The problem with the current FS API is that we don't
have a larger context that we could attach the gen
check to. Instead, it reads the revprops for a single
revision, and this may be called thousands of times
or not at all before the next write access. There is
no way of knowing within the FS_FS code.
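
A back-of-the-envelope way to see this is to count disk reads when the generation has to be checked on every single revprop lookup. The function below is purely illustrative (`disk_reads` is a made-up name) and assumes that reading the generation file costs about as much as reading a small revprop file, and that nothing is written between the lookups.

```python
def disk_reads(lookups, use_cache, gen_check_on_disk):
    """Count simulated disk reads for a sequence of revprop lookups.

    lookups: iterable of revision numbers, one per API call.
    use_cache: whether revprops are cached in memory at all.
    gen_check_on_disk: True if each cached lookup must first read
        the generation file; False if the generation lives in
        shared memory (modelled here as costing no disk read).
    """
    cached = set()
    reads = 0
    for rev in lookups:
        if use_cache and gen_check_on_disk:
            reads += 1  # read the generation file
        if not use_cache or rev not in cached:
            reads += 1  # read the revprop file itself
            cached.add(rev)
    return reads
```

For 3000 lookups spread over three revisions, this model gives 3000 reads without a cache, 3003 with a cache validated through a generation file, and 3 with a cache validated through shared memory.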

What sort of gain is the shared memory compared to the disk approach?


On many machines (x64 in particular), reading
and even writing the revprop generation in shared
memory is almost free. Reads cost nothing extra, with
the MOESI cache-coherence protocol doing its sync
magic in the background; a write takes a few ns.

Perhaps it would be
better to implement the non-shared memory option first?  That would also
allow multiple machines to access the repository.

Except for the new API testing code, everything has
been implemented on the branch by now. A review
would be very welcome.

-- Stefan^2.
