On Tue, Jan 26, 2016 at 10:01 AM, Thomas Leonard <[email protected]> wrote:

> On 26 January 2016 at 03:43, Tim Cuthbertson <[email protected]> wrote:
> > Hi all,
> >
> > I've got a small HTTP service which runs as a normal unix server and
> > which I've slowly been porting to work on mirage as well. One thing I
> > haven't found a solution for yet is atomic FS operations.
> >
> > The unix version relies on Unix.rename being atomic - I don't have
> > heavy storage needs, so I'm not worried about the performance of
> > writing a new file then renaming over the old one atomically. To get
> > things working with the existing code I implemented a crude `rename`
> > for mirage's FS backend, but obviously it is not atomic:
> >
> > https://github.com/timbertson/passe/blob/a07ca8fe5a3a6fe0803df4765078749310c3df5c/src/mirage/unikernel.ml#L10
> >
> > I believe that irmin does various things atomically, but my needs are
> > simple enough that converting my simple, flat-file storage to use
> > irmin as a backend instead seems like a lot of unnecessary work (and I
> > really don't need branching or history).
> >
> > Does anyone know of a simple way to do persistent writes to the FS in
> > mirage that ensure atomicity of written data? Perhaps folks with
> > knowledge of how irmin achieves atomicity can let me in on the base FS
> > operations that it uses to do that?
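
For reference, the Unix-side pattern described above (write a new file, then
rename it over the old one) can be sketched with the OCaml standard library
alone. This is a minimal sketch, not the code from passe: `atomic_write`,
`read_file` and the ".tmp" suffix are hypothetical names. It relies on
POSIX rename being atomic and does no fsync, so it only addresses readers
seeing a half-written file, not crash durability.

```ocaml
(* Replace [path] with [contents]: write a temporary file next to the
   target, then rename it over the target. The ".tmp" suffix is a
   hypothetical naming choice. *)
let atomic_write ~path contents =
  let tmp = path ^ ".tmp" in
  let oc = open_out_bin tmp in
  (try
     output_string oc contents;
     close_out oc
   with e ->
     (* On failure, drop the partial temp file and leave [path] untouched. *)
     close_out_noerr oc;
     (try Sys.remove tmp with Sys_error _ -> ());
     raise e);
  (* On POSIX systems rename replaces the target atomically. *)
  Sys.rename tmp path

(* Read a whole file into a string. *)
let read_file path =
  let ic = open_in_bin path in
  let n = in_channel_length ic in
  let s = really_input_string ic n in
  close_in ic;
  s
```

A reader that opens `path` at any point sees either the old contents or the
new contents, never a mixture.
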
>
> There are two separate things here:
>
> 1. Making operations appear atomic to the application (i.e. it will
> never see a half-written file).
>
> 2. Ensuring that if the system crashes and is restarted it will return
> to some state it previously had.
>
> (1) is pretty easy in a Unikernel since you can just make everything
> go via your own wrapper, e.g. with a Lwt_mutex around it. (2) is
> hard...
>
> Irmin ensures (1), but relies on the FS layer for (2). On Unix, it
> uses POSIX atomic rename. I don't think we have Irmin persistence
> working on Xen yet (maybe someone hacked it up with FAT, but it
> probably wasn't robust against crashes if so).
>
> I believe the current plan is to get this finished and working:
>
>   https://github.com/djs55/ocaml-btree
>
> Not sure how close it is to being ready. Perhaps Dave can comment...
>

It's not ready for use yet unfortunately :( Realistically it'll take a
couple of months (unless I suddenly find an excuse to spend more of my time
on it).

I'm hoping that an update will consist of constructing a new tree on disk
sharing as many nodes as possible with the current tree (wrapped by a mutex
to guarantee (1)) and then flipping the root pointer in an atomic sector
write to guarantee (2). I need to make sure blocks don't leak and can be
efficiently GCed too. None of this is fully implemented yet.
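
To make the idea concrete, here is a toy in-memory sketch of the scheme:
build a new tree that shares nodes with the current one, then publish it by
changing a single root pointer. It is not ocaml-btree code. The `disk`
record, the integer block addresses and the unbalanced binary tree are all
assumptions made for illustration, and the mutex that would serialise
writers is left out.

```ocaml
(* A toy "disk": blocks addressed by integers, plus one root pointer.
   All names are hypothetical. *)
type node =
  | Leaf
  | Node of int * string * string * int  (* left addr, key, value, right addr *)

type disk = {
  blocks : (int, node) Hashtbl.t;
  mutable next : int;   (* next free block address *)
  mutable root : int;   (* stands in for the root-pointer sector *)
}

let create () =
  let blocks = Hashtbl.create 16 in
  Hashtbl.replace blocks 0 Leaf;          (* block 0 is the empty tree *)
  { blocks; next = 1; root = 0 }

(* Write a fresh block; existing blocks are never modified. *)
let alloc d n =
  let addr = d.next in
  Hashtbl.replace d.blocks addr n;
  d.next <- addr + 1;
  addr

(* Build a new tree containing [key -> value] and return its root address.
   Only nodes on the path to [key] are copied; other subtrees are shared. *)
let rec insert d addr key value =
  match Hashtbl.find d.blocks addr with
  | Leaf -> alloc d (Node (0, key, value, 0))
  | Node (l, k, v, r) ->
    let c = compare key k in
    if c = 0 then alloc d (Node (l, key, value, r))
    else if c < 0 then
      let l' = insert d l key value in
      alloc d (Node (l', k, v, r))
    else
      let r' = insert d r key value in
      alloc d (Node (l, k, v, r'))

let rec find d addr key =
  match Hashtbl.find d.blocks addr with
  | Leaf -> None
  | Node (l, k, v, r) ->
    let c = compare key k in
    if c = 0 then Some v
    else if c < 0 then find d l key
    else find d r key

(* Build the new tree off to the side, then publish it with one
   assignment to the root pointer. *)
let update d key value =
  let new_root = insert d d.root key value in
  d.root <- new_root
```

Until `d.root` is reassigned, a reader starting from the old root still sees
the old tree in full. Blocks reachable only from old roots are never reused
in this sketch, which is the leak that a GC pass would have to deal with.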

Cheers,
Dave


> > Just thinking about it without any particular knowledge of xen or the
> > block storage, it seems like I could get away with three files for
> > each real file:
> >
> > <file>.ptr -> contents is just "a" or "b"
> > <file>.a
> > <file>.b
> >
> > Upon read, get the current "active" from <file>.ptr then read that.
> > Upon write, get the current active from <file>.ptr, write to the
> > _inactive_ file and then overwrite the byte in file.ptr to make it
> > active. I'd protect file writes with a process-level lock (to make
> > sure multiple writers don't conflict), which is sufficient for the
> > mirage backend since there are no multi-process concerns.
> >
> > Would that work? Are single-byte writes guaranteed to be atomic in the
> > FAT FS backend? Any better ideas?
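
The three-file scheme above can be sketched as follows. The store here is an
in-memory table standing in for a flat file system, and `read_raw`,
`write_raw`, `active`, `read` and `write` are hypothetical names, not
Mirage FS calls. The sketch assumes the write to the pointer file is atomic,
which is exactly the open question for FAT, and it leaves out the
writer lock.

```ocaml
(* In-memory stand-in for a flat file store; it only models whole-file
   reads and writes. *)
let store : (string, string) Hashtbl.t = Hashtbl.create 16

let read_raw name = Hashtbl.find_opt store name
let write_raw name data = Hashtbl.replace store name data

(* Which slot is live? A missing pointer file means nothing was written. *)
let active name = read_raw (name ^ ".ptr")

let other = function "a" -> "b" | _ -> "a"

(* Read: follow the pointer, then read the slot it names. *)
let read name =
  match active name with
  | None -> None
  | Some slot -> read_raw (name ^ "." ^ slot)

(* Write: fill the inactive slot, then flip the pointer. *)
let write name data =
  let target =
    match active name with
    | None -> "a"
    | Some slot -> other slot
  in
  write_raw (name ^ "." ^ target) data;   (* step 1: inactive slot *)
  write_raw (name ^ ".ptr") target        (* step 2: pointer flip *)
```

If a write is interrupted after step 1 but before step 2, the pointer still
names the old slot, so `read` keeps returning the previous contents.
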
>
> I vaguely recall that the FAT code builds up a list of blocks to write
> and passes them all together to a function. You could perhaps write
> them to a journal partition first. I suspect that trying to build
> anything reliable on top of FAT is a lost cause however...
>
>
> --
> Dr Thomas Leonard        http://roscidus.com/blog/
> GPG: DA98 25AE CAD0 8975 7CDA  BD8E 0713 3F96 CA74 D8BA
>
> _______________________________________________
> MirageOS-devel mailing list
> [email protected]
> http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
>



-- 
Dave Scott