Re: Crazy idea of cleanup the inode_record btrfsck things with SQL?

Austin S Hemmelgarn Mon, 01 Dec 2014 04:53:31 -0800

On 2014-11-30 20:58, Qu Wenruo wrote:

[BACKGROUND]
I'm trying to implement a function to repair a missing inode item.
In that case, the inode type must be salvaged (although it can fall back to
FILE).


One case is: if any dir_item/index or inode_ref refers to the inode as its
parent, the type of that inode must be DIR.

However, with the current btrfsck implementation (inode_record only records
backrefs), we are unable to search for the inode_backrefs whose parent is a
given inode number.

[FIRST IMPLEMENT DESIGN]
My first thought was to implement a generic inode-relation structure,
recording parent ino, child ino, name and namelen, and to store the structure
in an rbtree, not in the child's/parent's list.
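
To make the shape of that structure concrete, here is a minimal sketch in
Python. It is not btrfs-progs code: the names InodeRelation and RelationIndex
are hypothetical, and plain dictionaries stand in for the rbtree described
above. The point is only that one relation record is indexed from both the
parent side and the child side.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InodeRelation:
    """One parent/child link (hypothetical name, not a btrfs-progs type)."""
    parent_ino: int
    child_ino: int
    name: bytes

    @property
    def namelen(self):
        # namelen is derived from the name rather than stored separately.
        return len(self.name)


class RelationIndex:
    """Indexes every relation by parent and by child.

    Dictionaries are used here in place of the rbtree from the proposal.
    """

    def __init__(self):
        self._by_parent = defaultdict(set)
        self._by_child = defaultdict(set)

    def add(self, rel):
        self._by_parent[rel.parent_ino].add(rel)
        self._by_child[rel.child_ino].add(rel)

    def children_of(self, parent_ino):
        # The lookup that the backref-only inode_record cannot answer.
        return self._by_parent.get(parent_ino, set())

    def parents_of(self, child_ino):
        return self._by_child.get(child_ino, set())
```

With such an index, "does anything name this inode as its parent?" becomes a
single call to children_of().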

But I soon recognized that this is a perfect use case for a relational
database, with 'ino' as the primary key for the INODE table and
('parent_ino', 'child_ino', 'name') as the primary key for the INODE_REF table.
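
The two tables and the DIR rule from the background section could be sketched
roughly as follows, using Python's sqlite3 module for brevity. The table names
and primary keys follow the description above; the `type` column, the extra
index and the function names are assumptions made for illustration only.

```python
import sqlite3


def build_db(path=":memory:"):
    """Create the INODE and INODE_REF tables (sketch, not btrfs-progs code)."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE inode (
            ino  INTEGER PRIMARY KEY,
            type TEXT              -- assumed column: 'DIR', 'FILE', ...
        );
        CREATE TABLE inode_ref (
            parent_ino INTEGER NOT NULL,
            child_ino  INTEGER NOT NULL,
            name       BLOB    NOT NULL,
            PRIMARY KEY (parent_ino, child_ino, name)
        );
        -- Assumed extra index so lookups by child are also cheap.
        CREATE INDEX inode_ref_by_child ON inode_ref (child_ino);
    """)
    return conn


def is_referenced_as_parent(conn, ino):
    """True if any inode_ref row names `ino` as its parent."""
    row = conn.execute(
        "SELECT 1 FROM inode_ref WHERE parent_ino = ? LIMIT 1", (ino,)
    ).fetchone()
    return row is not None


def salvage_type(conn, ino):
    """Apply the rule above: referenced as a parent means DIR, else FILE."""
    return "DIR" if is_referenced_as_parent(conn, ino) else "FILE"
```

One caveat with this sketch: SQLite integers are signed 64-bit, while btrfs
inode numbers are unsigned 64-bit, so a real implementation would need to
decide how to store values above 2^63 - 1.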

[CRAZY IDEA]
So why not use SQL to implement the btrfsck inode-record things?

With such a crazy idea, it will be much easier to do any iteration from a
given ino, and with an already mature RDB implementation like sqlite3, we can
save hundreds of lines of code implementing the rb-tree or list.

[PROS]
1. Easy to maintain
    We no longer need to maintain the rbtree searching or list iteration,
    only simple SQL statements and their wrappers.

2. Easy to extend
    If we need to record something more, like extents and their relation to
    inodes, we only need to create 2 tables and several SQL statements and
    wrappers.

3. Reduced memory usage for HUGE fs.
    When metadata grows to several TB or even more, the current rb-tree based
    implementation may run short of memory since everything is stored in
    memory.
    But if we use SQL, an RDBMS like sqlite3 can store things either in
    memory or on disk, which may hugely reduce the memory usage for a huge
    btrfs.

    If we do not use an existing RDBMS, we need to implement a complicated
    memory control system to manage memory in userland.
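
As a rough illustration of the memory-or-disk choice in point 3, sqlite3
selects the backing store from the name passed when the database is opened.
The sketch below uses Python's sqlite3 module; the function name open_store
and the cache size are hypothetical, and only the INODE_REF table is created.

```python
import os
import sqlite3
import tempfile


def open_store(path=None, cache_kib=2048):
    """Open the record store in memory (path=None) or in a file on disk."""
    conn = sqlite3.connect(path if path is not None else ":memory:")
    # A negative cache_size is interpreted as a size in KiB. It bounds the
    # page cache of a file-backed database; an in-memory database still
    # keeps all of its data in RAM.
    conn.execute(f"PRAGMA cache_size = -{int(cache_kib)}")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS inode_ref (
            parent_ino INTEGER NOT NULL,
            child_ino  INTEGER NOT NULL,
            name       BLOB    NOT NULL,
            PRIMARY KEY (parent_ino, child_ino, name)
        )
    """)
    return conn


if __name__ == "__main__":
    # Small fs: keep everything in memory.
    small = open_store()
    # Huge fs: spill the records to a scratch file instead.
    with tempfile.TemporaryDirectory() as scratch:
        big = open_store(os.path.join(scratch, "fsck-records.db"))
        big.close()
    small.close()
```

The calling code is the same in both cases, which is the property the
proposal relies on.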

[CONS]
1. Heavy implementation
    SQL hides the rb-tree or B+ tree implementation but costs more memory (if
    not compressed) and CPU cycles, so it will be slower than the simple
    rb-tree implementation even with a lightweight RDBMS like sqlite3.

2. Heavy dependency
    If we use it, btrfs-progs will include an RDBMS as a build and runtime
    dependency.
    Having such low-level progs depend on a high-level program like sqlite3
    may be very strange.

3. A lot of rework on existing code.
    Even if SQL is easier to maintain and extend, if we use it, we still need
    to reimplement several hundred or even thousands of lines of code, not
    to mention the regression tests.

4. Copyright
    Will it cause any copyright problem to use a non-GPL RDBMS like sqlite3
    in GPLv2 btrfs-progs?

[NEED FEEDBACK]
Any feedback or discussion on the crazy idea is welcome. Since this may need
a lot of work, the idea definitely needs a lot of review before it comes to
code.

So, I think this does a good job of highlighting one of the bigger issues with btrfsck when it is compared to ext* and/or xfs. Despite this being a problem, I really don't think using an RDBMS is the way to fix it, both for the reasons outlined in other responses and because fsck should be as fast as possible when nothing is wrong with the fs.
