On Mon, 13.10.14 08:44, Alexander Larsson ([email protected]) wrote:

> In some sense it is unavoidable. We have to tie the exact file data to
> the signature. However, does this mean we have to shove random bits at
> the kernel rather than going through the syscall interface?
> 
> btrfs-receive is a userspace tool that uses the regular userspace i/o
> syscalls to do its modifications. How does this propose to handle the
> signatures? If it can do it, why would it not be possible to do
> ourselves?

Sure, it's possible to implement our own btrfs send/recv logic in
userspace.

At LPC we sat down with Chris Mason about this, and it's certainly an
option for us; the code for serializing/deserializing things is
supposedly not that difficult.
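
To make this a bit more concrete, here's a rough, untested sketch of
walking the send stream. The struct layouts mirror what I remember
from the kernel's fs/btrfs/send.h, so treat the details as an
assumption and check the real headers before building on this:

/* Minimal sketch of walking a btrfs send stream, reading from stdin.
 * Struct layouts assumed from fs/btrfs/send.h; verify against real
 * headers. Little-endian host assumed, the wire fields are __le32/16. */
#include <stdio.h>
#include <stdint.h>
#include <string.h>

#define SEND_MAGIC "btrfs-stream"          /* 12 chars + NUL on the wire */

struct stream_header {
        char magic[sizeof(SEND_MAGIC)];    /* 13 bytes */
        uint32_t version;
} __attribute__((packed));

struct cmd_header {
        uint32_t len;                      /* payload length, header excluded */
        uint16_t cmd;                      /* subvol, mkfile, write, chmod, ... */
        uint32_t crc;                      /* crc32c, computed with crc = 0 */
} __attribute__((packed));

int main(void) {
        struct stream_header sh;
        struct cmd_header ch;
        char buf[4096];

        if (fread(&sh, sizeof(sh), 1, stdin) != 1 ||
            memcmp(sh.magic, SEND_MAGIC, sizeof(SEND_MAGIC)) != 0) {
                fprintf(stderr, "Not a btrfs send stream.\n");
                return 1;
        }
        printf("stream version %u\n", sh.version);

        while (fread(&ch, sizeof(ch), 1, stdin) == 1) {
                uint32_t left = ch.len;

                printf("cmd %u with %u bytes of TLV attributes\n",
                       ch.cmd, ch.len);

                /* A real receiver would decode the TLV attributes here and
                 * replay the command as plain syscalls; we just skip them. */
                while (left > 0) {
                        size_t n = fread(buf, 1,
                                         left > sizeof(buf) ? sizeof(buf) : left,
                                         stdin);
                        if (n == 0)
                                return 1;
                        left -= n;
                }
        }
        return 0;
}

A real receiver would then replay each decoded command with plain
open()/write()/chmod()/... syscalls, which is exactly the point where
per-file verification hooks could live.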

> > Also, the hardlink farms are certainly not pretty.
> 
> They are not pretty, sure. However they are very widely available, and
> the *only* solution that allows page-cache sharing between images, and
> "trivial" deduplication between unrelated images. I don't think we
> should too easily dismiss it.

So, we asked Chris about dedup. He basically said that online dedup is
there, and will hence be done implicitly when you do btrfs recv. In
other words, dedup is really nothing we need to actively think about
if we use btrfs; it's just there.
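
(For reference, the out-of-band interface for this exists already too:
since kernel 3.12 there's the BTRFS_IOC_FILE_EXTENT_SAME ioctl, which
compares two ranges byte-by-byte and shares the extents only if they
match. A rough sketch of invoking it, error handling trimmed:

/* Sketch: deduplicating the first 128K of two files via the btrfs
 * out-of-band dedup ioctl. Needs kernel >= 3.12 and linux/btrfs.h.
 * File names and the range here are arbitrary examples. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>

int main(int argc, char *argv[]) {
        if (argc != 3) {
                fprintf(stderr, "Usage: %s SOURCE DEST\n", argv[0]);
                return 1;
        }

        int src = open(argv[1], O_RDONLY);
        int dst = open(argv[2], O_RDWR);
        if (src < 0 || dst < 0) {
                perror("open");
                return 1;
        }

        /* One destination range; multiple can be batched per call. */
        struct btrfs_ioctl_same_args *args =
                calloc(1, sizeof(*args) +
                          sizeof(struct btrfs_ioctl_same_extent_info));
        args->logical_offset = 0;
        args->length = 128 * 1024;
        args->dest_count = 1;
        args->info[0].fd = dst;
        args->info[0].logical_offset = 0;

        /* The kernel verifies the ranges are identical before sharing
         * extents, so this is safe even on files that differ. */
        if (ioctl(src, BTRFS_IOC_FILE_EXTENT_SAME, args) < 0) {
                perror("BTRFS_IOC_FILE_EXTENT_SAME");
                return 1;
        }

        printf("status=%d, deduped %llu bytes\n",
               args->info[0].status,
               (unsigned long long) args->info[0].bytes_deduped);
        return 0;
}

That's the kind of thing we'd get for free, rather than having to
maintain hardlink farms ourselves.)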

> > Harald has been playing around with some build logic that makes sure
> > that rebuilt app updates are efficiently shipped as btrfs send/recv,
> > with stable inode numbers and stuff.
> 
> How exactly do you envision this would work in practice for updates? Say
> you have an application that receives regular updates (major and minor).
> At any time the user comes in and does a fetch-from-scratch, or an update
> between two essentially "random" versions.  What does the server store?
> A copy of each full image? Only for major versions? A delta between each
> consecutive image? Delta between each possible image pair?

Well, it could certainly generate the diffs on the fly, by looking at
the actual btrfs volumes with their subvolumes. However, I'd assume
we'd pre-generate relevant deltas in advance, maybe at logarithmically
increasing distances.
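
Nothing is set in stone regarding the exact spacing, but as a purely
hypothetical illustration: for the current version v, keep deltas from
v-1, v-2, v-4, v-8 and so on, so that a client n versions behind can
catch up in roughly log2(n) hops rather than one full download:

/* Hypothetical illustration: pre-generate deltas against parents at
 * logarithmically increasing distances from the current version. */
#include <stdio.h>

int main(void) {
        unsigned current = 100;            /* made-up current version */

        for (unsigned d = 1; d <= current; d *= 2)
                printf("pre-generate delta %u -> %u\n",
                       current - d, current);
        return 0;
}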

> > You know, this is explicitly something where we shouldn't reinvent the
> > wheel. It's quite frankly crazy to come up with a new serialization
> > format, that contains per-file verification data, that then somehow
> > can be deserialized on some destination system again back into the fs
> > layer...
> 
> The hard part is obviously having the kernel verify the signatures; that
> requires deep kernel FS work, which doesn't exist yet, and which only the
> btrfs people are working on. However, when they come up with something
> it could very well be that it can be used for other things than
> btrfs-receive (as btrfs-receive is essentially just a stream of syscalls).
> Are the design discussions on this happening in the open somewhere?

Yes, we had a couple of phone calls with Chris in the past and met
with him at LPC about this. But we are not involved in the actual
implementation of this, we just make sure we are in sync regarding our
requirements. 

Facebook's requirements and ours are thankfully not too far off. While
they only care about the verified OS, we also want to solve things more
generically.

> > I know that the Red Hat fs crew hates btrfs like it was the devil, and
> > loves LVM/DM like it was a healthy project. But yuck, just yuck!
> 
> I'm not particularly fond of a device-mapper approach either, but I was
> listing all options, so it needed to be in there. That said, I'm also a
> btrfs user on all my development machines, and I can't say my experience
> with it has been exactly stellar...

Well, true. But again, this won't change unless we actually push it
out to people. And I am quite sure this is a reasonable approach,
since initially we will only store redundant data in it, which is
pretty much only accessed for reading. btrfs really should handle
that, and even if it didn't, we could easily reconstruct everything by
downloading the image again.

Lennart

-- 
Lennart Poettering, Red Hat