Hi Andrei,

On 2021-03-30 07:38, Andrei Borzenkov wrote:
On 30.03.2021 08:33, Andrei Borzenkov wrote:
On 29.03.2021 22:14, Claudius Heine wrote:
Hi Andrei,

On 2021-03-29 18:30, Andrei Borzenkov wrote:
On 29.03.2021 16:16, Claudius Heine wrote:
Hi,

I am currently investigating the possibility of using `btrfs-stream` files
(generated by `btrfs send`) for deploying image-based updates to
systems (probably embedded ones).

One of the issues I encountered here is that `btrfs send` does not use
any diff algorithm on files that have changed from one snapshot to the
next.


btrfs send works at the block level. It sends the blocks that differ
between two snapshots.

Are you sure?


Yes.

Ok, sorry for doubting you. My assumptions were wrong.


I did a test with a 32MiB random file. I created one snapshot, then
changed (not deleted or added) one byte in that file and then created a
snapshot again. `btrfs send` created a >32MiB `btrfs-stream` file. If it
were purely block based, I would have expected it to contain just the
changed block, not the whole file. And if I use a smaller file on the
same file system, the `btrfs-stream` file is smaller as well.

I looked into those `btrfs-stream` files using [1] and [2] as well
as the code. While I haven't understood everything there yet, it
currently looks to me like it is file based.


btrfs send is not a pure block-based image, because that would require
two absolutely identical filesystems. It needs to replicate the
filesystem structure, so it of course needs to know which files are
created/deleted. But for each file it only sends the parts that changed
since the previous snapshot. This only works if both snapshots refer to
the *same* file.


Or more precisely: btrfs send knows which filesystem content was part
of the previous snapshot, and therefore is already present on the
destination, and it will not send that content again. It is actually
more or less irrelevant which files this content belongs to.

I think I understood that now.
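
For the archives, this is how I would redo the test to check the
block-level behaviour; the device, mount point and subvolume names are
only placeholders:

    # sketch only; device and paths are placeholders
    mkfs.btrfs -f /dev/vdb
    mount /dev/vdb /mnt/vol
    btrfs subvolume create /mnt/vol/data
    dd if=/dev/urandom of=/mnt/vol/data/blob bs=1M count=32
    btrfs subvolume snapshot -r /mnt/vol/data /mnt/vol/snap1

    # flip one byte strictly in place (no truncate, no temporary file)
    printf 'X' | dd of=/mnt/vol/data/blob bs=1 seek=12345 conv=notrunc
    btrfs subvolume snapshot -r /mnt/vol/data /mnt/vol/snap2

    # incremental stream relative to the parent snapshot
    btrfs send -p /mnt/vol/snap1 -f /tmp/incr.stream /mnt/vol/snap2
    ls -l /tmp/incr.stream    # expected: a few KiB, not >32MiB

    # list the commands contained in the stream without receiving it
    btrfs receive --dump < /tmp/incr.stream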


As was already mentioned, you need to understand how your files are
changed. In particular, standard tools for software updates do not
rewrite files in place; they create new files with new content. From
the btrfs perspective these are completely different: two files with
the same name in two snapshots do not share a single byte. When you
compute the delta between two snapshots, you get instructions to delete
the old file and create a new file with new content (which is then
renamed to the same name as the deleted old file). This by necessity
also sends the full new content.
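
That difference can be demonstrated directly, continuing with the
placeholder paths from the sketch above:

    # case A: overwrite one block of the existing file in place
    dd if=/dev/zero of=/mnt/vol/data/blob bs=4K count=1 conv=notrunc
    btrfs subvolume snapshot -r /mnt/vol/data /mnt/vol/snapA
    btrfs send -p /mnt/vol/snap2 -f /tmp/a.stream /mnt/vol/snapA   # small

    # case B: "write a new file, rename it over the old one", as most
    # update tools do (dd forces a real data copy, no reflink)
    dd if=/mnt/vol/data/blob of=/mnt/vol/data/blob.new bs=1M
    mv /mnt/vol/data/blob.new /mnt/vol/data/blob
    btrfs subvolume snapshot -r /mnt/vol/data /mnt/vol/snapB
    btrfs send -p /mnt/vol/snapA -f /tmp/b.stream /mnt/vol/snapB   # ~32MiB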

As you said, many standard tools create new files instead of updating files in place. But I guess a `dedupe` run before creating the snapshot could help here, right?
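
For the dedupe idea, something like duperemove could be run across the
previous snapshot and the freshly populated tree before the new
snapshot is taken. This is only a sketch with placeholder paths, and I
have not verified how well duperemove can establish sharing when one
side is a read-only snapshot:

    # try to re-share identical blocks between the previous snapshot and
    # the newly populated tree before snapshotting it (placeholder paths)
    duperemove -dr --hashfile=/var/tmp/dedupe.hash \
        /mnt/vol/snap-prev /mnt/vol/data
    btrfs subvolume snapshot -r /mnt/vol/data /mnt/vol/snap-next
    btrfs send -p /mnt/vol/snap-prev -f /tmp/incr.stream /mnt/vol/snap-next

This also only helps for blocks that are still bit-identical and
aligned; a rebuilt file with different embedded timestamps or different
compression will not dedupe.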

If we have a root file system build process that always regenerates all files and then copies them into a file system, then all files are 'different' from a btrfs perspective.

So yes, btrfs replication is block based; similarity is determined by
how much physical data is shared between two files. You, on the other
hand, expect file-based replication, where file names determine whether
files should be considered the same and changes are computed for the
two files with the same name.

Right. Maybe the file path could be used as a hint that there is an opportunity to save resources by creating block-based deltas.
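
One way to act on that hint purely in user space, assuming the new root
file system is available as a plain directory tree: seed the next
subvolume from the previous snapshot so unchanged blocks stay shared,
and let rsync rewrite only the parts that differ, in place. Again just
a sketch with placeholder paths:

    # writable snapshot of the previous generation keeps all extents shared
    btrfs subvolume snapshot /mnt/vol/snap-prev /mnt/vol/data-next

    # overwrite changed files in place instead of replacing them;
    # --inplace avoids the "new file + rename" pattern and --no-whole-file
    # forces rsync's delta transfer even for local copies
    rsync -a --delete --inplace --no-whole-file \
        /path/to/new-rootfs/ /mnt/vol/data-next/

    btrfs subvolume snapshot -r /mnt/vol/data-next /mnt/vol/snap-next
    btrfs send -p /mnt/vol/snap-prev -f /tmp/update.stream /mnt/vol/snap-next

How much this shrinks the stream still depends on how similar the
rebuilt files really are at the block level.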

I guess I have to think about this some more.

Thanks a lot!
Claudius
